complete content on data analyst roadmap (#6748)

* complete content on data analyst roadmap

* Apply suggestions from code review

reverted changed node dimensions
pull/6768/head
dsh 2 months ago committed by GitHub
parent 2d14deb166
commit d87ac9bbba
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
  1. 1
      src/data/roadmaps/data-analyst/content/analaysis--reporting-with-excel@sgXIjVTbwdwdYoaxN3XBM.md
  2. 7
      src/data/roadmaps/data-analyst/content/average@yn1sstYMO9du3rpfQqNs9.md
  3. 1
      src/data/roadmaps/data-analyst/content/correlation-analysis@murioZ0NdrTix_lqSGz-8.md
  4. 7
      src/data/roadmaps/data-analyst/content/data-storage-solutions@iTmtpXe7dR4XKslgpsk2q.md
  5. 1
      src/data/roadmaps/data-analyst/content/data-transformation@t_BRtEharsrOZxoyX0OzV.md
  6. 3
      src/data/roadmaps/data-analyst/content/datedif@yBlJrNo9eO470dLp6OaQZ.md
  7. 3
      src/data/roadmaps/data-analyst/content/dispersion@2ldO-_ZnIg364Eo8Jyfgr.md
  8. 2
      src/data/roadmaps/data-analyst/content/ggplot2@E0hIgQEeZlEidr4HtUFrL.md
  9. 7
      src/data/roadmaps/data-analyst/content/ggplot2@n3M49lgNPn28hm7kzki-a.md
  10. 4
      src/data/roadmaps/data-analyst/content/introduction@3xp2fogAVmwXQhdzhZDWR.md
  11. 2
      src/data/roadmaps/data-analyst/content/key-concepts-of-data@R12sArWVpbIs_PHxBqVaR.md
  12. 2
      src/data/roadmaps/data-analyst/content/knn@SStzU_iXSvI_9QWbvGNou.md
  13. 1
      src/data/roadmaps/data-analyst/content/kurtosis@PqGO8AU1zE2ZdtqrIrOkZ.md
  14. 3
      src/data/roadmaps/data-analyst/content/learn-sql@i4VCwFm-wc9cqE73i-BIb.md
  15. 7
      src/data/roadmaps/data-analyst/content/matplotlib@tvDdXwaRPsUSTqJGaLS3P.md
  16. 4
      src/data/roadmaps/data-analyst/content/variance@ict4JkoVM-AzPbp9bDztg.md
  17. 2
      src/data/roadmaps/data-analyst/content/what-is-data-analytics@yCnn-NfSxIybUQ2iTuUGq.md
  18. 90
      src/data/roadmaps/data-analyst/data-analyst.json

@ -3,3 +3,4 @@
Excel is a powerful tool utilized by data analysts worldwide to store, manipulate, and analyze data. It offers a vast array of features such as pivot tables, graphs and a powerful suite of formulas and functions to help sift through large sets of data. A data analyst uses Excel to perform a wide range of tasks, from simple data entry and cleaning, to more complex statistical analysis and predictive modeling. Proficiency in Excel is often a key requirement for a data analyst, as its versatility and ubiquity make it an indispensable tool in the field of data analysis.
- [@article@W3Schools - Excel](https://www.w3schools.com/excel/index.php)
- [@course@Microsoft Excel Course](https://support.microsoft.com/en-us/office/excel-video-training-9bc05390-e94c-46af-a5b3-d7c22f6990bb)

@ -1,3 +1,8 @@
# Average
-When focusing on data analysis, understanding key statistical concepts is crucial. Amongst these, central tendency is a foundational element. Central Tendency refers to the measure that determines the center of a distribution. The average is a commonly used statistical tool by which data analysts discern trends and patterns. As one of the most recognized forms of central tendency, figuring out the "average" involves summing all values in a data set and dividing by the number of values. This provides analysts with a 'typical' value, around which the remaining data tends to cluster, facilitating better decision-making based on existing data.
+When focusing on data analysis, understanding key statistical concepts is crucial. Amongst these, central tendency is a foundational element. Central Tendency refers to the measure that determines the center of a distribution. The average is a commonly used statistical tool by which data analysts discern trends and patterns. As one of the most recognized forms of central tendency, figuring out the "average" involves summing all values in a data set and dividing by the number of values. This provides analysts with a 'typical' value, around which the remaining data tends to cluster, facilitating better decision-making based on existing data.
+Learn more from the following resources:
+- [@article@How to calculate the average](https://support.microsoft.com/en-gb/office/calculate-the-average-of-a-group-of-numbers-e158ef61-421c-4839-8290-34d7b1e68283#:~:text=Average%20This%20is%20the%20arithmetic,by%206%2C%20which%20is%205.)
+- [@article@Average Formula](https://www.cuemath.com/average-formula/)
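
As a quick illustration of the calculation the Average entry describes (sum the values, then divide by how many there are), here is a minimal Python sketch. The function name `average` and the sample values are assumptions made for illustration; they are not part of the roadmap content.

```python
from statistics import mean


def average(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the average of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical sample data, chosen only for illustration.
sample = [4, 8, 15, 16, 23, 42]

print(average(sample))  # hand-rolled calculation
print(mean(sample))     # standard-library equivalent
```

In practice the standard-library `statistics.mean` is usually preferable; the hand-written version is shown only to make the sum-and-divide step explicit.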

@ -5,3 +5,4 @@ Correlation Analysis is a quantitative method that data analysts widely employ t
Visit the following resources to learn more:
- [@article@Correlation](https://www.mathsisfun.com/data/correlation.html)
- [@article@What is correlation analysis?](https://blog.flexmr.net/correlation-analysis-definition-exploration)

@ -1,3 +1,8 @@
# Data Storage Solutions
-As a business enterprise expands, so does its data. For data analysts, the surge in information means they need efficient and scalable data storage solutions to manage vast volumes of structured and unstructured data, collectively referred to as Big Data. Big Data storage solutions are critical in preserving the integrity of data while also providing quick and easy access to the data when needed. These solutions use software and hardware components to securely store massive amounts of information across numerous servers, allowing data analysts to perform robust data extraction, data processing and complex data analyses. There are several options, from the traditional Relational Database Management Systems (RDBMS) to the more recent NoSQL databases, Hadoop ecosystems, and Cloud storage solutions, each offering unique capabilities and benefits to cater for different big data needs.
+As a business enterprise expands, so does its data. For data analysts, the surge in information means they need efficient and scalable data storage solutions to manage vast volumes of structured and unstructured data, collectively referred to as Big Data. Big Data storage solutions are critical in preserving the integrity of data while also providing quick and easy access to the data when needed. These solutions use software and hardware components to securely store massive amounts of information across numerous servers, allowing data analysts to perform robust data extraction, data processing and complex data analyses. There are several options, from the traditional Relational Database Management Systems (RDBMS) to the more recent NoSQL databases, Hadoop ecosystems, and Cloud storage solutions, each offering unique capabilities and benefits to cater for different big data needs.
+Learn more from the following resources:
+- [@official@SQL Roadmap](https://roadmap.sh/sql)
+- [@official@PostgreSQL Roadmap](https://roadmap.sh/postgresql-dba)

@ -2,4 +2,5 @@
Data Transformation, also known as Data Wrangling, is an essential part of a Data Analyst's role. This process involves the conversion of data from a raw format into another format to make it more appropriate and valuable for a variety of downstream purposes such as analytics. Data Analysts transform data to make the data more suitable for analysis, ensure accuracy, and to improve data quality. The right transformation techniques can give the data a structure, multiply its value, and enhance the accuracy of the analytics performed by serving meaningful results.
- [@article@What is data transformation?](https://www.qlik.com/us/data-management/data-transformation)
- [@feed@Explore top posts about Data Analysis](https://app.daily.dev/tags/data-analysis?ref=roadmapsh)

@ -6,4 +6,5 @@ The `DATEDIF` function is an incredibly valuable tool for a Data Analyst in Exce
Learn more from the following resources:
-- [@article@DATEDIF function](https://support.microsoft.com/en-gb/office/datedif-function-25dba1a4-2812-480b-84dd-8b32a451b35c)
+- [@article@DATEDIF function](https://support.microsoft.com/en-gb/office/datedif-function-25dba1a4-2812-480b-84dd-8b32a451b35c)
+- [@article@How to use DATEDIF in Excel](https://www.excel-easy.com/examples/datedif.html)

@ -4,4 +4,5 @@ Dispersion in descriptive analysis, specifically for a data analyst, offers a cr
Visit the following resources to learn more:
- [@article@Standard Deviation and Variance](https://www.mathsisfun.com/data/standard-deviation.html)
- [@article@What is dispersion?](https://www.investopedia.com/terms/d/dispersion.asp)
- [@video@Statistics 101 - Measures of Dispersion](https://www.youtube.com/watch?v=goXdWMZxlqM)

@ -4,5 +4,5 @@ When it comes to data visualization in R programming, ggplot2 stands tall as one
Learn more from the following resources:
-- [@article@ggplot2 website]
+- [@article@ggplot2 website](https://ggplot2.tidyverse.org/)
- [@video@Make beautiful graphs in R](https://www.youtube.com/watch?v=qnw1xDnt_Ec)

@ -1,3 +1,8 @@
# Data Visualization with ggplot2
-ggplot2 is an important and powerful tool in the data analyst's toolkit, especially for visualizing and understanding complex datasets. Built within the R programming language, it provides a flexible, cohesive environment for creating graphs. The main strength of ggplot2 lies in its ability to produce sophisticated and tailored visualizations. This allows data analysts to communicate data-driven findings in an efficient and effective manner, enabling clear communication to stakeholders about relevant insights and patterns identified within the data.
+ggplot2 is an important and powerful tool in the data analyst's toolkit, especially for visualizing and understanding complex datasets. Built within the R programming language, it provides a flexible, cohesive environment for creating graphs. The main strength of ggplot2 lies in its ability to produce sophisticated and tailored visualizations. This allows data analysts to communicate data-driven findings in an efficient and effective manner, enabling clear communication to stakeholders about relevant insights and patterns identified within the data.
+Learn more from the following resources:
+- [@article@ggplot2 website](https://ggplot2.tidyverse.org/)
+- [@video@Make beautiful graphs in R](https://www.youtube.com/watch?v=qnw1xDnt_Ec)

@ -1,5 +1,3 @@
# Introduction to Data Analysis
-Data Analysis plays a crucial role in today's data-centric world. It involves the practice of inspecting, cleansing, transforming, and modeling data to extract valuable insights for decision-making. A **Data Analyst** is a professional primarily tasked with collecting, processing, and performing statistical analysis on large datasets. They discover how data can be used to answer questions and solve problems. With the rapid expansion of data in modern firms, the role of a data analyst has been evolving greatly, making them a significant asset in business strategy and decision-making processes.
-Learn more from the following resources:
+Data Analysis plays a crucial role in today's data-centric world. It involves the practice of inspecting, cleansing, transforming, and modeling data to extract valuable insights for decision-making. A **Data Analyst** is a professional primarily tasked with collecting, processing, and performing statistical analysis on large datasets. They discover how data can be used to answer questions and solve problems. With the rapid expansion of data in modern firms, the role of a data analyst has been evolving greatly, making them a significant asset in business strategy and decision-making processes.

@ -1,3 +1,3 @@
-# Introduction to Key Concepts for Data
+# Introduction to Key Concepts for Data Analysts
In the realm of data analysis, understanding some key concepts is essential. Data analysis is the process of inspecting, cleansing, transforming, and modeling data to discover useful information and support decision-making. In the broadest sense, data can be classified into various types like nominal, ordinal, interval and ratio, each with a specific role and analysis technique. Higher-dimensional data types like time-series, panel data, and multi-dimensional arrays are also critical. On the other hand, data quality and data management are key concepts to ensure clean and reliable datasets. With an understanding of these fundamental concepts, a data analyst can transform raw data into meaningful insights.

@ -4,5 +4,5 @@ K-Nearest Neighbors (KNN) is a simple yet powerful algorithm used in the field o
Learn more from the following resources:
-- [@article@What is the k-nearest neighbors (KNN) algorithm?](https://www.ibm.com/topics/knn#:~:text=The%20k%2Dnearest%20neighbors%20(KNN,used%20in%20machine%20learning%20today.)
+- [@article@What is the k-nearest neighbors (KNN) algorithm?](https://www.ibm.com/topics/knn#:~:text=The%20k%2Dnearest%20neighbors%20KNN,used%20in%20machine%20learning%20today.)
- [@article@Nearest Neighbors](https://scikit-learn.org/stable/modules/neighbors.html)

@ -5,3 +5,4 @@ Understanding distribution shapes is an integral part of a Data Analyst's daily
Visit the following resources to learn more:
- [@article@Kurtosis: Definition, Types, and Importance](https://www.investopedia.com/terms/k/kurtosis.asp)
- [@video@What is Kurtosis?](https://www.youtube.com/watch?v=AsxEDBhESJg)

@ -1,3 +0,0 @@
-# SQL for Data Analysts
-Structured Query Language, or SQL, is an essential tool for every data analyst. As a domain-specific language used in programming and designed for managing data held in relational database management systems, SQL allows analysts to manipulate and analyse large volumes of data efficiently. Understanding SQL allows a data analyst to extract insights from data stored in databases, conduct complex queries, and create elaborate data reports. SQL is recognized for its effectiveness in data manipulation and its compatibility with other coding languages, making it a fundamental competency in the data analytics field.

@ -1,3 +1,8 @@
# Matplotlib
-For a Data Analyst, understanding data and being able to represent it in a visually insightful form is a crucial part of effective decision-making in any organization. Matplotlib, a plotting library for the Python programming language, is an extremely useful tool for this purpose. It presents a versatile framework for generating line plots, scatter plots, histogram, bar charts and much more in a very straightforward manner. This library also allows for comprehensive customizations, offering a high level of control over the look and feel of the graphics it produces, which ultimately enhances the quality of data interpretation and communication.
+For a Data Analyst, understanding data and being able to represent it in a visually insightful form is a crucial part of effective decision-making in any organization. Matplotlib, a plotting library for the Python programming language, is an extremely useful tool for this purpose. It presents a versatile framework for generating line plots, scatter plots, histogram, bar charts and much more in a very straightforward manner. This library also allows for comprehensive customizations, offering a high level of control over the look and feel of the graphics it produces, which ultimately enhances the quality of data interpretation and communication.
+Learn more from the following resources:
+- [@video@Learn Matplotlib in 6 minutes](https://www.youtube.com/watch?v=nzKy9GY12yo)
+- [@article@Matplotlib Website](https://matplotlib.org/)

@ -4,5 +4,5 @@ Data analysts heavily rely on statistical concepts to analyze and interpret data
Learn more from the following resources:
-- [@article@](https://www.investopedia.com/terms/v/variance.asp)
-- [@article@How to calculate variance](https://www.scribbr.co.uk/stats/variance-meaning/
+- [@article@What is variance?](https://www.investopedia.com/terms/v/variance.asp)
+- [@article@How to calculate variance](https://www.scribbr.co.uk/stats/variance-meaning/)

@ -1,3 +1,3 @@
-# Introduction to Data Analytics for Data Analysts
+# Introduction to Data Analytics
Data Analytics is a core component of a Data Analyst's role. The field involves extracting meaningful insights from raw data to drive decision-making processes. It includes a wide range of techniques and disciplines ranging from the simple data compilation to advanced algorithms and statistical analysis. As a data analyst, you are expected to understand and interpret complex digital data, such as the usage statistics of a website, the sales figures of a company, or client engagement over social media, etc. This knowledge enables data analysts to support businesses in identifying trends, making informed decisions, predicting potential outcomes - hence playing a crucial role in shaping business strategies.

@ -1696,39 +1696,6 @@
"focusable": true,
"selectable": true
},
-{
-"width": 142,
-"height": 49,
-"id": "i4VCwFm-wc9cqE73i-BIb",
-"type": "topic",
-"position": {
-"x": -326.321810236714,
-"y": 540.1000000000001
-},
-"selected": false,
-"data": {
-"label": "Learn SQL",
-"style": {
-"fontSize": 17,
-"justifyContent": "flex-start",
-"textAlign": "center"
-},
-"oldId": "SiYUdtYMDImRPmV2_XPkH"
-},
-"zIndex": 999,
-"style": {
-"width": 142,
-"height": 49
-},
-"resizing": false,
-"positionAbsolute": {
-"x": -326.321810236714,
-"y": 540.1000000000001
-},
-"dragging": false,
-"focusable": true,
-"selectable": true
-},
{
"width": 279,
"height": 49,
@ -5545,6 +5512,32 @@
"y": 369.3426822070478
},
"dragging": false
+},
+{
+"id": "BnvNJCHnPsTo25Hn0dN9v",
+"type": "button",
+"position": {
+"x": -321.643620473428,
+"y": 543.1000000000001
+},
+"selected": false,
+"data": {
+"label": "Learn SQL",
+"href": "https://roadmap.sh/sql",
+"color": "#ffffff",
+"backgroundColor": "#2a79e4",
+"style": {
+"fontSize": 17
+}
+},
+"zIndex": 999,
+"width": 128,
+"height": 49,
+"positionAbsolute": {
+"x": -321.643620473428,
+"y": 543.1000000000001
+},
+"dragging": false
}
],
"edges": [
@ -6404,37 +6397,36 @@
},
{
"style": {
-"strokeDasharray": "0",
+"strokeDasharray": "0.8 8",
"strokeLinecap": "round",
"strokeWidth": 3.5,
"stroke": "#2b78e4"
},
-"source": "i4VCwFm-wc9cqE73i-BIb",
-"sourceHandle": "z2",
-"target": "YDomoxzf-65sru6XVu0_X",
-"targetHandle": "y1",
+"source": "TeewVruErSsD4VLXcaDxp",
+"sourceHandle": "y2",
+"target": "lTycWscyFPi-BtkNg9cdm",
+"targetHandle": "z1",
"data": {
-"edgeStyle": "solid"
+"edgeStyle": "dashed"
},
-"id": "reactflow__edge-i4VCwFm-wc9cqE73i-BIbz2-YDomoxzf-65sru6XVu0_Xy1",
-"selected": false,
-"type": "simplebezier"
+"id": "reactflow__edge-TeewVruErSsD4VLXcaDxpy2-lTycWscyFPi-BtkNg9cdmz1",
+"selected": false
},
{
"style": {
-"strokeDasharray": "0.8 8",
+"strokeDasharray": "0",
"strokeLinecap": "round",
"strokeWidth": 3.5,
"stroke": "#2b78e4"
},
-"source": "TeewVruErSsD4VLXcaDxp",
-"sourceHandle": "y2",
-"target": "lTycWscyFPi-BtkNg9cdm",
-"targetHandle": "z1",
+"source": "BnvNJCHnPsTo25Hn0dN9v",
+"sourceHandle": "z2",
+"target": "YDomoxzf-65sru6XVu0_X",
+"targetHandle": "y1",
"data": {
-"edgeStyle": "dashed"
+"edgeStyle": "solid"
},
-"id": "reactflow__edge-TeewVruErSsD4VLXcaDxpy2-lTycWscyFPi-BtkNg9cdmz1",
+"id": "reactflow__edge-BnvNJCHnPsTo25Hn0dN9vz2-YDomoxzf-65sru6XVu0_Xy1",
"selected": false
"selected": false
}
]
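
Because this change swaps the `Learn SQL` topic node (`i4VCwFm-wc9cqE73i-BIb`) for a button node (`BnvNJCHnPsTo25Hn0dN9v`) and re-points an edge at the new id, a quick consistency check can help confirm that no edge still references a removed node. The sketch below is only an illustration: it assumes the roadmap JSON has top-level `nodes` and `edges` arrays (only `"edges"` is visible in this diff), and the helper name `dangling_edges` and the tiny `example` document are hypothetical.

```python
def dangling_edges(roadmap):
    """Return the ids of edges whose source or target is not a known node id."""
    # Assumption: nodes live under a top-level "nodes" key.
    node_ids = {node["id"] for node in roadmap.get("nodes", [])}
    return [
        edge["id"]
        for edge in roadmap.get("edges", [])
        if edge["source"] not in node_ids or edge["target"] not in node_ids
    ]


# Hypothetical, heavily trimmed roadmap: the new button node exists,
# but one edge still points at the removed topic node.
example = {
    "nodes": [
        {"id": "BnvNJCHnPsTo25Hn0dN9v"},
        {"id": "YDomoxzf-65sru6XVu0_X"},
    ],
    "edges": [
        {
            "id": "reactflow__edge-i4VCwFm-wc9cqE73i-BIbz2-YDomoxzf-65sru6XVu0_Xy1",
            "source": "i4VCwFm-wc9cqE73i-BIb",
            "target": "YDomoxzf-65sru6XVu0_X",
        },
        {
            "id": "reactflow__edge-BnvNJCHnPsTo25Hn0dN9vz2-YDomoxzf-65sru6XVu0_Xy1",
            "source": "BnvNJCHnPsTo25Hn0dN9v",
            "target": "YDomoxzf-65sru6XVu0_X",
        },
    ],
}

print(dangling_edges(example))
```

To run it against the real file, the result of `json.load` on `data-analyst.json` could be passed in place of `example`, provided the file follows the assumed layout.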
