
Standards Mapping for Arkansas Data Science

78 Standards in this Framework
19 Standards Mapped
24% Mapped to Course

Standard | Lessons
1.1.1
Identify the key stages of a data science project lifecycle.
  1. 1.2 The Data Science Life Cycle
1.1.2
Identify key roles and their responsibilities in a data science team (e.g., business stakeholders, define objectives; data engineers, build pipelines; data scientists, develop models; domain experts, provide expertise).
  1. 1.1 What is Data Science?
1.1.3
Define and create project goals and deliverables (e.g., problem statements, success metrics, expected outcomes, final reports, summary presentations).
  1. 4.2 Final Project
1.1.4
Create and manage project timelines (e.g., milestones, deadlines, task dependencies, resource allocation).
1.1.5
Create a student portfolio including completed data science projects, reports, and other student-driven accomplishments.
1.2.1
Collaborate in team-based projects (e.g., team discussions, maintaining project logs, following protocols, code review, documentation).
1.2.2
Communicate technical findings to non-technical audiences (e.g., creating data visualizations, presenting key insights, explaining complex concepts).
  1. 2.5 Pivot Tables
  2. 3.3 Responsible Data Science
  3. 4.2 Final Project
1.2.3
Make data-driven decisions and recommendations by proposing solutions and evaluating alternatives.
1.3.1
Identify ethical considerations in data collection, storage and usage (e.g., data privacy, bias, transparency, consent).
  1. 3.2 Big Data and Bias
1.3.2
Demonstrate responsible data handling practices (e.g., protecting sensitive information, citing data sources, maintaining data integrity).
  1. 3.1 Data Privacy
  2. 3.3 Responsible Data Science
1.3.3
Report results responsibly (e.g., addressing limitations, acknowledging uncertainties, preventing misinterpretation).
  1. 3.1 Data Privacy
  2. 3.3 Responsible Data Science
2.1.1
Differentiate between discrete and continuous probability distributions.
2.1.2
Calculate probabilities using discrete distributions (e.g., Uniform, Binomial, Poisson).
2.1.3
Calculate probabilities using continuous distributions (e.g., Uniform, Normal, Student's t, Exponential).
2.1.4
Apply Bayes’ Theorem to calculate posterior probabilities.
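A minimal Python sketch of applying Bayes' Theorem; the prevalence, sensitivity, and specificity values are invented for illustration:

```python
# Bayes' Theorem: P(D | +) = P(+ | D) * P(D) / P(+)
# Illustrative example: probability of having a condition given a positive test.

prevalence = 0.01      # P(D): prior probability of the condition (assumed)
sensitivity = 0.95     # P(+ | D): true positive rate (assumed)
specificity = 0.90     # P(- | not D): true negative rate (assumed)

# Total probability of a positive test result
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Posterior probability of the condition given a positive test
posterior = sensitivity * prevalence / p_positive
print(f"P(condition | positive test) = {posterior:.3f}")  # about 0.088
```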
2.2.1
Calculate p-values using a programming library and interpret the significance of the results.
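One way this might look with SciPy, using a two-sample t-test on synthetic data (the group values are generated for illustration only):

```python
import numpy as np
from scipy import stats

# Synthetic data for two groups (illustrative values only)
rng = np.random.default_rng(42)
group_a = rng.normal(loc=50, scale=5, size=30)
group_b = rng.normal(loc=53, scale=5, size=30)

# Two-sample t-test: is the difference in means statistically significant?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Common interpretation: reject the null hypothesis if p < 0.05
if p_value < 0.05:
    print("Difference is statistically significant at the 0.05 level.")
else:
    print("Fail to reject the null hypothesis.")
```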
2.2.2
Perform hypothesis testing.
2.2.3
Identify and explain Type I and Type II Errors (e.g., false positives, false negatives).
2.2.4
Calculate and interpret confidence intervals.
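A short SciPy sketch of a 95% confidence interval for a mean; the sample values are invented:

```python
import numpy as np
from scipy import stats

# Sample data (illustrative values only)
sample = np.array([48, 52, 51, 49, 50, 53, 47, 52, 50, 51])

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean

# t-based interval: confidence level, degrees of freedom, center, scale
ci_low, ci_high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```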
2.2.5
Design and analyze experiments to compare outcomes (e.g., identifying control/treatment groups, selecting sample sizes, determining variables, implementing A/B tests).
2.3.1
Perform basic matrix operations including addition, subtraction and scalar multiplication.
2.3.2
Calculate dot products and interpret their geometric meaning.
2.3.3
Apply matrix transformations to data sets.
2.3.4
Compute and interpret distances between vectors.
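A NumPy sketch covering the operations in 2.3.1 through 2.3.4; the vectors, matrices, and points are arbitrary examples:

```python
import numpy as np

# Basic matrix operations: addition, subtraction, scalar multiplication
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A + B, A - B, 3 * A, sep="\n")

# Dot product and the angle between two vectors (geometric interpretation)
u = np.array([1.0, 2.0])
v = np.array([3.0, 1.0])
dot = np.dot(u, v)
cos_angle = dot / (np.linalg.norm(u) * np.linalg.norm(v))
print(f"dot = {dot}, angle = {np.degrees(np.arccos(cos_angle)):.1f} degrees")

# Matrix transformation applied to a small data set (each row is a point)
points = np.array([[1, 0], [0, 1], [1, 1]])
rotation_90 = np.array([[0, -1], [1, 0]])   # rotate 90 degrees counterclockwise
print(points @ rotation_90.T)

# Euclidean distance between two vectors
print(np.linalg.norm(u - v))
```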
3.1.1
Create and manipulate (e.g., sort, filter, aggregate, reshape, merge, extract, clean, transform, subset) one-dimensional data structures for computational analysis (e.g., lists, arrays, series).
3.1.2
Create and manipulate (e.g., transpose, join, slice, pivot, reshape) two-dimensional data structures for organizing structured datasets (e.g., matrices, dataframes).
3.1.3
Utilize operations (e.g., arithmetic, aggregations, transformations) across data structures based on analytical needs.
  1. 2.6 Statistical Measures
3.1.4
Apply indexing methods to select and filter data based on position, labels, and conditions.
3.2.1
Import data into a DataFrame from common spreadsheet formats (e.g., csv, xlsx).
3.2.2
Import data into a DataFrame directly from a database (e.g., using the SQLAlchemy library).
3.2.3
Import data into a DataFrame using web scraping libraries (e.g., Beautiful Soup, Selenium).
3.2.4
Import data into a DataFrame leveraging API requests (e.g., Requests, urllib).
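A pandas sketch of a few of these import paths (3.2.1, 3.2.2, and 3.2.4); the file name, connection string, and API URL below are placeholders rather than real resources:

```python
import pandas as pd
import requests
from sqlalchemy import create_engine

# 1. From a spreadsheet-style file; "sales.csv" is a placeholder path
df_csv = pd.read_csv("sales.csv")

# 2. From a database via SQLAlchemy; the connection string is a placeholder
engine = create_engine("sqlite:///example.db")
df_sql = pd.read_sql("SELECT * FROM sales", engine)

# 3. From a web API using the requests library; the URL is a placeholder
response = requests.get("https://api.example.com/sales")
df_api = pd.DataFrame(response.json())
```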
3.3.1
Convert between data types as needed for analysis (e.g., strings to numeric values, dates to timestamps, categorical to numeric encoding).
3.3.2
Convert between structures as needed for analysis (e.g., lists to arrays, arrays to data frames).
3.3.3
Standardize and clean text data (e.g., remove whitespace, correct typos, standardize formats).
  1. 2.2 Data Cleaning
3.3.4
Identify and remove duplicate or irrelevant rows/records.
  1. 2.2 Data Cleaning
3.3.5
Restructure columns/fields for analysis (e.g., splitting, combining, renaming, removing irrelevant data).
  1. 2.2 Data Cleaning
3.3.6
Apply masking operations to filter and select data.
  1. 2.3 Sort and Filter
3.3.7
Handle missing and invalid data values using appropriate methods (e.g., removal, imputation, interpolation).
  1. 2.2 Data Cleaning
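A brief pandas sketch of the listed approaches, using a small invented DataFrame:

```python
import numpy as np
import pandas as pd

# Small example DataFrame with missing values (illustrative data)
df = pd.DataFrame({
    "temp": [21.0, np.nan, 23.5, np.nan, 25.0],
    "city": ["Little Rock", "Fayetteville", None, "Jonesboro", "Conway"],
})

dropped = df.dropna()                                     # removal: drop rows with any missing value
imputed = df.fillna({"temp": df["temp"].mean(),           # imputation: fill with a summary statistic
                     "city": "Unknown"})
interpolated = df.assign(temp=df["temp"].interpolate())   # interpolation: estimate from neighboring values
```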
3.3.8
Identify and handle outliers using statistical methods.
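One common statistical method is the interquartile range (IQR) rule; a minimal sketch using an invented pandas Series:

```python
import pandas as pd

values = pd.Series([10, 12, 11, 13, 12, 95, 11, 10, 14, 12])  # 95 is an obvious outlier

# IQR rule: flag values more than 1.5 * IQR beyond the quartiles
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
cleaned = values[(values >= lower) & (values <= upper)]
print(outliers.tolist())   # [95]
```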
3.4.1
Examine data structures using preview and summary methods (e.g., head, info, shape, describe).
3.4.2
Create new data frames by merging or joining two data frames.
3.4.3
Sort and group records based on conditions and/or attributes.
  1. 2.5 Pivot Tables
3.4.4
Create functions to synthesize features from existing variables (e.g., mathematical operations, scaling, normalization).
4.1.1
Generate histograms and density plots to display data distributions.
  1. 2.4 Data Visualizations
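A matplotlib/pandas sketch; the data is randomly generated for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Randomly generated sample data (illustrative only)
data = pd.Series(np.random.default_rng(0).normal(loc=70, scale=10, size=500))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
data.plot.hist(bins=30, ax=ax1, title="Histogram")     # frequency distribution
data.plot.density(ax=ax2, title="Density plot")        # smoothed (KDE) distribution
plt.tight_layout()
plt.show()
```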
4.1.2
Create box plots and violin plots to show data spread and quartiles.
4.1.3
Construct Q-Q plots to assess data normality.
4.2.1
Generate scatter plots and pair plots to show relationships between variables.
4.2.2
Generate correlation heatmaps to display feature relationships.
4.2.3
Plot decision boundaries to visualize data separations.
4.3.1
Generate bar charts and line plots to compare categorical data.
  1. 2.4 Data Visualizations
4.3.2
Create heat maps to display confusion matrices and tabular comparisons.
4.3.3
Plot ROC curves and precision-recall curves to evaluate classifications.
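A scikit-learn sketch of both curve types; the synthetic dataset and logistic regression classifier are assumptions for illustration:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import PrecisionRecallDisplay, RocCurveDisplay
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)

# ROC curve and precision-recall curve for the held-out test set
RocCurveDisplay.from_estimator(model, X_test, y_test)
PrecisionRecallDisplay.from_estimator(model, X_test, y_test)
plt.show()
```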
4.4.1
Generate line plots to show trends over time.
4.4.2
Create residual plots to analyze prediction errors.
4.4.3
Plot moving averages and trend lines.
4.5.1
Draw conclusions by interpreting statistical measures (e.g., p-values, confidence intervals, hypothesis test results).
4.5.2
Evaluate model performance using appropriate metrics and visualizations (e.g., R-squared, confusion matrix, residual plots).
4.5.3
Identify patterns, trends, and relationships in data visualizations (e.g., correlation strength, outliers, clusters).
  1. 1.1 What is Data Science?
4.5.4
Draw actionable insights from analysis results.
  1. 2.5 Pivot Tables
  2. 4.2 Final Project
5.1.1
Describe the key characteristics of Big Data (e.g., Volume, Velocity, Variety, Veracity).
5.1.2
Identify real-world applications of Big Data across industries (e.g., healthcare, finance, retail, social media).
  1. 3.2 Big Data and Bias
5.1.3
Analyze case studies of successful and unsuccessful Big Data implementations across industries (e.g., recommendation systems, fraud detection, predictive maintenance).
5.1.4
Identify common Big Data platforms and tools (e.g., Hadoop for distributed storage, Spark for data processing, Tableau for visualization, MongoDB for unstructured data).
5.2.1
Describe how organizations store structured and unstructured data.
5.2.2
Compare different types of data storage systems (e.g., data warehouse, data lakes, databases).
6.1.1
Contrast supervised and unsupervised learning.
6.1.2
Differentiate between classification and regression problems.
6.1.3
Evaluate model performance using appropriate metrics (e.g., Accuracy, Precision/Recall, Mean Squared Error, R-squared).
6.2.1
Perform linear regression for prediction problems.
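A minimal scikit-learn sketch; the hours-studied and exam-score values are invented:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Invented data: hours studied (feature) vs. exam score (target)
hours = np.array([[1], [2], [3], [4], [5], [6]])
scores = np.array([52, 58, 65, 70, 74, 81])

model = LinearRegression().fit(hours, scores)
print(f"slope = {model.coef_[0]:.2f}, intercept = {model.intercept_:.2f}")
print(model.predict([[7]]))   # predicted score for 7 hours of study
```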
6.2.2
Perform multiple regression for prediction problems.
6.2.3
Perform logistic regression for classification tasks.
6.2.4
Implement Naive Bayes Classification using probability concepts.
6.2.5
Perform k-means clustering using distance metrics.
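A short scikit-learn sketch, which uses Euclidean distance by default; the 2-D points are invented:

```python
import numpy as np
from sklearn.cluster import KMeans

# Invented 2-D points forming two loose groups
X = np.array([[1, 2], [1, 4], [2, 3],
              [8, 8], [9, 10], [10, 9]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster assignment for each point
print(kmeans.cluster_centers_)  # centroids found by minimizing Euclidean distance
```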
6.3.1
Apply standard methods to split data into training and testing sets.
6.3.2
Apply cross-validation techniques (e.g., k-fold, leave-one-out, stratified k-fold).
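A scikit-learn sketch covering both 6.3.1 and 6.3.2 on synthetic regression data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Synthetic regression data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Standard 80/20 train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train, y_train)
print(f"test R-squared: {model.score(X_test, y_test):.3f}")

# 5-fold cross-validation on the full dataset
scores = cross_val_score(LinearRegression(), X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0))
print(f"cross-validated R-squared: {scores.mean():.3f}")
```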
6.3.3
Identify and address overfitting/underfitting.
6.3.4
Select appropriate models based on data characteristics and problem requirements.