Life Cycle Phases of Data Analytics

Aditya Dwivedi
2 min read · Jun 15, 2022

In this article, I discuss the life cycle phases of data analytics, going through each phase one by one.

Data Analytics Lifecycle:
The data analytics lifecycle is designed for Big Data problems and data science projects. The cycle is iterative, reflecting how real projects move back and forth between phases. To address the distinct requirements of performing analysis on Big Data, a step-by-step methodology is needed to organize the activities and tasks involved in acquiring, processing, analyzing, and repurposing data.

Phase 1: Discovery
  • The data science team learns about and investigates the problem.
  • The team develops context and understanding.
  • The team identifies the data sources needed and available for the project.
  • The team formulates initial hypotheses that can later be tested with data.

Phase 2: Data Preparation
  • This phase covers the steps to explore, preprocess, and condition data prior to modeling and analysis.
  • It requires an analytic sandbox. The team performs extract, load, and transform (ELT) or extract, transform, and load (ETL) to get data into the sandbox.
  • Data preparation tasks are likely to be performed multiple times, and not in a predefined order.
  • Tools commonly used in this phase include Hadoop, Alpine Miner, and OpenRefine.
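
To make the data preparation phase a bit more concrete, here is a minimal sketch of an extract, load, and transform flow. It assumes an in-memory SQLite database standing in for the analytic sandbox; the table names, the sample records, and the cleaning rules are all hypothetical and only serve the illustration.

```python
import sqlite3

# Hypothetical raw records, standing in for an extract from a source system.
RAW_ROWS = [
    ("1001", " Alice ", "42.50"),
    ("1002", "bob", ""),          # missing amount
    ("1003", "Carol", "17.25"),
    ("1003", "Carol", "17.25"),   # duplicate record
]


def load_raw(conn, rows):
    """Load step: copy the extracted rows into the sandbox unchanged."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def transform(conn):
    """Transform step: clean and condition the data inside the sandbox."""
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT DISTINCT
            CAST(order_id AS INTEGER) AS order_id,
            LOWER(TRIM(customer))     AS customer,
            CAST(amount AS REAL)      AS amount
        FROM raw_orders
        WHERE TRIM(amount) <> ''
        """
    )


def build_sandbox(rows):
    """Create the sandbox (assumed here to be an in-memory database)."""
    conn = sqlite3.connect(":memory:")
    load_raw(conn, rows)
    transform(conn)
    return conn


if __name__ == "__main__":
    sandbox = build_sandbox(RAW_ROWS)
    for row in sandbox.execute("SELECT * FROM orders ORDER BY order_id"):
        print(row)
```

Keeping the raw table untouched and doing the cleaning inside the sandbox means the transform step can be rerun as often as needed, which fits the point that preparation tasks are repeated and not done in a fixed order.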

Phase 3: Model Planning
  • The team explores the data to learn about the relationships between variables, then selects the key variables and the most suitable models.
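
One simple way to explore relationships between variables during model planning is to rank candidate variables by their correlation with the target. The sketch below uses the Pearson correlation coefficient, which is only one possible choice; the variable names and numbers are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumes neither sequence is constant (otherwise the variance is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_variables(features, target):
    """Sort candidate variables by absolute correlation with the target."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up data for illustration only.
FEATURES = {
    "ad_spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "store_visits": [2.0, 1.0, 4.0, 3.0, 5.0],
    "rainfall": [5.0, 3.0, 4.0, 1.0, 2.0],
}
SALES = [2.1, 3.9, 6.2, 8.1, 9.8]

if __name__ == "__main__":
    for name, r in rank_variables(FEATURES, SALES):
        print(f"{name}: r = {r:+.3f}")
```

Correlation only captures linear relationships, so a ranking like this is a starting point for choosing key variables, not a final selection.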

Phase 4: Model Building
  • The team builds and executes models based on the work done in the model planning phase.
  • The team develops datasets for training, testing, and production purposes.
  • The team also considers whether its existing tools will suffice for running the models, or whether it needs a more robust environment for executing them.
  • Free or open-source tools: Jupyter, R and PL/R, Octave, WEKA, Tableau Public.
  • Commercial tools: MATLAB, STATISTICA, Cognos.
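
As a rough illustration of the model building phase, the sketch below splits a dataset into training and test sets, fits a model on the training set, and measures the error on the held-out test set. The model is a plain least-squares line chosen only to keep the example short; the data is synthetic and the function names are my own.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(model, rows):
    """Average absolute difference between predictions and actual values."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in rows) / len(rows)


# Synthetic data: y = 3x + 2 plus a small alternating offset as noise.
DATA = [(float(x), 3.0 * x + 2.0 + (0.1 if x % 2 else -0.1)) for x in range(20)]

if __name__ == "__main__":
    train, test = train_test_split(DATA)
    model = fit_line(train)
    print("slope, intercept:", model)
    print("test MAE:", mean_absolute_error(model, test))
```

Evaluating on rows the model never saw during fitting is what makes the separate test dataset useful. A production dataset would be scored the same way once the model is deployed.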

Phase 5: Communicate Results
  • After executing the model, the team compares the modeling outcomes against the criteria established for success and failure.
  • The team considers how best to articulate the findings and outcomes to the various team members and stakeholders, taking caveats and assumptions into account.
  • The team should identify the key findings, quantify the business value, and develop a narrative to summarize and convey the findings to stakeholders.
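
Comparing model outcomes against the success criteria can be done with a small helper like the one below. This is only a sketch: the metric names, thresholds, and results are hypothetical, and a real project would take them from the criteria agreed with stakeholders earlier in the lifecycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """A success criterion: a metric name and the threshold it must reach."""
    metric: str
    threshold: float
    higher_is_better: bool = True

    def passed(self, value):
        if self.higher_is_better:
            return value >= self.threshold
        return value <= self.threshold


def evaluate(results, criteria):
    """Return (metric, value, threshold, passed) for each criterion."""
    report = []
    for c in criteria:
        value = results[c.metric]
        report.append((c.metric, value, c.threshold, c.passed(value)))
    return report


def summarize(report):
    """Turn the evaluation into short lines suitable for a briefing."""
    lines = []
    for metric, value, threshold, ok in report:
        status = "met" if ok else "NOT met"
        lines.append(f"{metric}: {value:.2f} (target {threshold:.2f}) -> {status}")
    return "\n".join(lines)


# Hypothetical criteria and model results.
CRITERIA = [
    Criterion("precision", 0.80),
    Criterion("recall", 0.70),
    Criterion("latency_ms", 200.0, higher_is_better=False),
]
RESULTS = {"precision": 0.84, "recall": 0.65, "latency_ms": 120.0}

if __name__ == "__main__":
    print(summarize(evaluate(RESULTS, CRITERIA)))
```

A plain-language summary like this can sit next to the caveats and assumptions when the findings are presented to stakeholders.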

Phase 6: Operationalize
  • The team communicates the benefits of the project more broadly and sets up a pilot project to deploy the work in a controlled way before broadening it to the full enterprise of users.
  • This approach lets the team learn about the performance and related constraints of the model in a production environment on a small scale, and make adjustments before full deployment.
  • The team delivers final reports, briefings, and code.
  • Free or open-source tools: Octave, WEKA, SQL, MADlib.
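
One possible way to run a controlled pilot before full deployment is to route a fixed percentage of users to the new model. The sketch below assigns users to the pilot by hashing their ID, so the same user always gets the same treatment. The function names and the percentage-based scheme are assumptions for illustration, not a prescribed method.

```python
import hashlib


def in_pilot(user_id, pilot_percent):
    """Deterministically assign a user to the pilot based on a hash of the ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < pilot_percent


def score(user_id, pilot_percent, new_model, current_process):
    """Route pilot users to the new model and everyone else to the existing process."""
    handler = new_model if in_pilot(user_id, pilot_percent) else current_process
    return handler(user_id)


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    pilot_users = [u for u in users if in_pilot(u, 10)]
    print(f"{len(pilot_users)} of {len(users)} users are in the 10% pilot")
```

Raising `pilot_percent` step by step widens the rollout without moving anyone out of the pilot, because a user whose bucket is below 10 is also below 50.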

Aditya Dwivedi

MTech Scholar | Data Engineering | BITS Pilani’22 | UIT RGPV Alumni 2019 | ML Ops Practitioner