The Junior Data Science Reality: You Are a SQL Janitor

This article is written for the "Kaggle Grandmaster" wannabe.

You have spent the last 6 months living in Jupyter Notebooks. You know the mathematical difference between L1 and L2 regularization. You have fine-tuned a BERT model on a dataset you found on Reddit. You dream in PyTorch and Scikit-learn.

You believe that your first job will involve "Building Models", "Training AI", or "Solving AGI".

You believe you are entering the industry as a Scientist — a thinker who will be paid to experiment, hypothesize, and optimize.

If you think your daily life will resemble a DeepMind research paper, this article is your reality check.

5 min read · Reviewed by Editorial Desk
Last Reality Check: March 29, 2026

Key Takeaways

  • The Data Science Truth: The "Data Science" title covers a wide spectrum of work, most of which isn't machine learning.
  • The Analytics Trap: Once you are good at SQL and dashboards, the company needs the reports done and has little reason to train you on ML.
  • Data Science Is Wrong For You If: You only want to build ML models, you hate SQL and data cleaning, or you expect research-style work.

The Expectation

The expectation is sold to you by EdTech influencers and Coursera certificates.

"Data is the new Oil."

You expect to walk into a company and be handed a perfectly clean, labeled dataset. You expect the Business Stakeholders to ask you for "Predictions" and "Insights".

You imagine your workflow like this:

  • Import Data
  • Train Model
  • Optimize Hyperparameters
  • Present cool 3D graphs to the CEO
  • Get promoted for increasing revenue by 20%

You think 80% of your time will be spent on Modelling and 20% on deployment.

You think SQL is "legacy tech" for backend engineers, and Excel is for finance guys.

The Reality

What Junior Data Scientists Actually Do:

📊 Data Science Job Reality

What Courses Teach | What Juniors Actually Do
Machine learning algorithms | Clean messy SQL data
Neural networks | Build dashboards
Statistical modeling | Answer ad-hoc data requests
Research papers | Excel exports for business teams
Kaggle competitions | Debug data pipelines
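
To make the right-hand column concrete, here is a minimal sketch of a typical ad-hoc request: total sales by category for one month, where the category labels were typed in inconsistently. The `orders` table, its columns, and every row are hypothetical, invented for illustration; it uses Python's built-in `sqlite3` so it runs without a real warehouse.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        order_date TEXT,   -- ISO date string
        category   TEXT,   -- free-text label, inconsistently entered
        amount     REAL
    );
    INSERT INTO orders (order_date, category, amount) VALUES
        ('2026-02-03', 'Electronics',   1200.0),
        ('2026-02-10', ' electronics ', 800.0),
        ('2026-02-14', 'BOOKS',         300.0),
        ('2026-02-20', NULL,            150.0),
        ('2026-03-01', 'Books',         999.0);
""")


def sales_by_category(conn, month_start, next_month_start):
    """Total sales per cleaned category for orders in [month_start, next_month_start)."""
    query = """
        WITH cleaned AS (
            SELECT
                -- Trim whitespace, lowercase, and bucket blanks/NULLs as 'unknown'.
                COALESCE(NULLIF(LOWER(TRIM(category)), ''), 'unknown') AS clean_category,
                amount
            FROM orders
            WHERE order_date >= ? AND order_date < ?
        )
        SELECT clean_category, SUM(amount) AS total_sales
        FROM cleaned
        GROUP BY clean_category
        ORDER BY total_sales DESC
    """
    return conn.execute(query, (month_start, next_month_start)).fetchall()


for category, total in sales_by_category(conn, "2026-02-01", "2026-03-01"):
    print(f"{category}: {total:.2f}")
```

Most of the query is cleaning, not analysis. Without the `TRIM`/`LOWER`/`COALESCE` step, "Electronics" and " electronics " would show up as two separate categories and the stakeholder would ask why.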

The SQL Janitor Reality:

70-80% of junior data science work is:

  • Writing SQL queries to pull data
  • Cleaning data that's never clean
  • Building reports and dashboards
  • Answering "can you pull this data?" requests
  • Diagnosing why numbers don't match

The machine learning you studied? You'll use it on 5-10% of your tasks. And that's if you're lucky enough to have problems that need ML rather than simple analytics.
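
"Diagnosing why numbers don't match" deserves its own illustration. One common cause, though far from the only one, is a join that silently duplicates rows. The sketch below assumes two hypothetical tables, `orders` and `shipments`, where one order can ship in several parcels; all names and figures are invented.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 250.0);
    -- Order 2 shipped in two parcels, so it has two shipment rows.
    INSERT INTO shipments VALUES (10, 1), (11, 2), (12, 2);
""")

# The figure finance reports: one row per order.
finance_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# The figure on the dashboard: the join repeats order 2 once per shipment,
# so its amount is counted twice.
dashboard_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN shipments s ON s.order_id = o.order_id
""").fetchone()[0]

# One possible fix: filter on shipped orders without multiplying rows.
fixed_total = conn.execute("""
    SELECT SUM(amount)
    FROM orders
    WHERE order_id IN (SELECT order_id FROM shipments)
""").fetchone()[0]

print(f"finance={finance_total}, dashboard={dashboard_total}, fixed={fixed_total}")
```

Nothing here needs a model. It needs someone who knows the grain of each table, and that knowledge is exactly what the janitor work builds.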

📈 Data Science Time Allocation

Activity | What You Expected | Reality (Junior Role)
Machine Learning | 50% | 5-10%
Data cleaning | 10% | 35%
SQL queries | 10% | 30%
Dashboards/reporting | 10% | 15%
Stakeholder requests | 5% | 10%
Meetings/communication | 5% | 10%

Case Study - The ML Dreamer:

Priya, 25, Junior Data Scientist at E-commerce Startup:

  • Master's in ML from a good institute
  • Expectation: Building recommendation systems
  • Reality: "Can you pull last month's sales by category?"
  • ML projects worked on in 18 months: 1
  • SQL queries written: Hundreds
  • Dashboards built: 15+
  • Current feeling: "I'm a well-paid data analyst, not a data scientist"

Q1 2026 Reality Check

The SQL janitor dynamic has paradoxically intensified with LLM adoption. Companies that deployed AI-assisted BI tools (Tableau AI, Looker AI, Power BI Copilot) now expect junior data professionals to produce more reports, faster, with the same headcount. The actual ML percentage of junior DS roles has not increased; if anything, companies are deferring custom ML projects in favor of buying foundation model API access for commodity tasks. What has increased is the expectation that junior data scientists can prompt-engineer outputs from LLMs while also handling traditional SQL/reporting work. Same salary band, doubled surface area.

Salary and Growth Reality

Data Role Salary Clarity:

💰 Detailed Data Role Comparison

Role | Year 2 | Year 5 | Year 8 | ML Work %
Business Analyst | Rs 8 LPA | Rs 14 LPA | Rs 22 LPA | 0%
Data Analyst | Rs 10 LPA | Rs 18 LPA | Rs 28 LPA | 0-5%
Data Scientist (typical) | Rs 12 LPA | Rs 24 LPA | Rs 40 LPA | 10-25%
ML Engineer | Rs 15 LPA | Rs 32 LPA | Rs 55 LPA | 50-70%
Applied Scientist | Rs 18 LPA | Rs 40 LPA | Rs 70 LPA | 70-90%

If you want ML work AND high salary, target ML Engineer or Applied Scientist. "Data Scientist" at most companies is analytics with occasional modeling.
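
The gap is not only in the starting number; it compounds. As a rough sketch, the implied annual growth rate between Year 2 and Year 8 can be computed as (Year 8 / Year 2)^(1/6) - 1. The figures below are copied from the table above; the function and variable names are just illustrative.

```python
# Year 2 and Year 8 figures (Rs LPA) copied from the comparison table above.
SALARY_LPA = {
    "Business Analyst": (8, 22),
    "Data Analyst": (10, 28),
    "Data Scientist (typical)": (12, 40),
    "ML Engineer": (15, 55),
    "Applied Scientist": (18, 70),
}

YEARS_BETWEEN = 6  # Year 2 to Year 8


def annual_growth(start, end, years):
    """Compound annual growth rate between two salary points."""
    return (end / start) ** (1 / years) - 1


growth_by_role = {
    role: annual_growth(year2, year8, YEARS_BETWEEN)
    for role, (year2, year8) in SALARY_LPA.items()
}

for role, rate in growth_by_role.items():
    print(f"{role}: {rate:.1%} per year")
```

On these figures the implied rate climbs from roughly 18% a year for Business Analyst to roughly 25% a year for Applied Scientist. Treat it as a way to read the table, not as a forecast: the inputs are point benchmarks, not guaranteed trajectories.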

Where Real ML Work Exists:

  • Tech giants (Google AI, Meta FAIR, Amazon Science)
  • AI-first startups (core product is ML)
  • Research labs (slower, academic style)
  • Specialized teams at large companies

Most companies don't have enough data quality, infrastructure, or business problems for real ML. They hire "Data Scientists" and give them analyst work.

Cross-check your take-home with the CTC Decoder and compare ranges in Salary Reality.

Where Most People Get Stuck

Where Junior Data Scientists Get Stuck:

The Analytics Trap:

You're good at SQL and dashboards now. You're valuable for that work. The company doesn't want to train you on ML; it needs the reports done. You become a specialist in precisely what you didn't want to do.

The Portfolio Gap:

Your Kaggle projects date from your bootcamp. Your work projects are all internal dashboards. When you interview for "real" DS roles, you can't show production ML experience.

Escape Routes:

  1. Target ML Engineering: More engineering, less ambiguity. The work is what it claims to be.
  2. Join AI-First Companies: Startups where ML is the product, not a nice-to-have.
  3. Build Open Source/Side Projects: Create an ML portfolio outside of work. Prove you can do the interesting stuff.
  4. Research Roles: Academic or industry research labs. Lower pay, real ML work.
  5. Specialize in DS Infrastructure: MLOps, feature stores, model serving. Less glamorous, more real demand.

If this matches your current situation, run the Resignation Risk Analyzer before making your next move.

Who Should Avoid This Path

Data Science Is Wrong For You If:

  • You only want to build ML models: 70% of the job isn't that
  • You hate SQL and data cleaning: That's most of the work
  • You expect research-style work: Production constraints rule
  • You joined because of course hype: Reality doesn't match marketing
  • You want clear deliverables: DS projects are often ambiguous and fail

The Data Role Clarification:

📊 What Each Data Role Actually Does

Title | Reality | ML Portion | Salary Trajectory
Data Analyst | SQL + dashboards + reporting | 0-5% | Rs 8-28 LPA
Data Scientist | SQL + analysis + occasional ML | 10-30% | Rs 12-50 LPA
ML Engineer | Building + deploying models | 50-70% | Rs 15-65 LPA
Data Engineer | Pipelines + infrastructure | 5-10% | Rs 12-50 LPA

If you want ML, target ML Engineering. If you're okay with analytics + occasional ML, Data Scientist works. If you want pure analytics, save yourself the ML courses and own the Data Analyst identity.

Decision Framework

Use this quick framework before changing role, company, or specialization.

  • If your take-home is not compounding with experience, benchmark externally — do not accept internal narratives.
  • If role expectations rise without title or pay movement, escalate with documented outcomes.
  • If your growth path is unclear beyond 6–9 months, run a switch-or-specialize decision cycle now.
  • Watch for the Analytics Trap: once you are good at SQL and dashboards, the company has every reason to keep you on that work.

Common Mistakes Checklist

  • Treating outlier salaries as planning baselines.
  • Using title changes as a substitute for genuine capability growth.
  • Delaying market benchmarking until after compensation has already stagnated.
  • Over-indexing on model demos without production deployment depth.

Evidence By Section

Claim: Popular narratives about data science roles in India overweight outlier outcomes and underweight base-rate career trajectories.

Evidence: AmbitionBox Salary Insights, Glassdoor India Salaries

Claim: Observed compensation and growth outcomes for data science professionals diverge significantly from social-media storytelling.

Evidence: Glassdoor India Salaries, LinkedIn Jobs (India)

Claim: Data Science salary ranges in India vary materially by company type, negotiation leverage, and market cycle timing.

Evidence: AmbitionBox Salary Insights, Glassdoor India Salaries, LinkedIn Jobs (India), Naukri Jobs (India)

Claim: Professionals in data science plateau fastest when scope quality stagnates while responsibility and expectations keep rising.

Evidence: LinkedIn Jobs (India), Naukri Jobs (India), Kaggle State of Data/AI

Frequently Asked Questions

What does junior data science work in India actually look like?
Mostly this:
  • Writing SQL queries to pull data
  • Cleaning data that's never clean
  • Building reports and dashboards
  • Answering "can you pull this data?" requests
  • Diagnosing why numbers don't match

What salary can data science professionals realistically earn in India?
Per the role comparison above, a typical Data Scientist moves from about Rs 12 LPA in Year 2 to about Rs 40 LPA in Year 8. If you want ML work AND high salary, target ML Engineer or Applied Scientist. "Data Scientist" at most companies is analytics with occasional modeling.

Who should avoid a junior data science role in India?
Anyone who only wants to build ML models. If you want ML, target ML Engineering. If you're okay with analytics + occasional ML, Data Scientist works. If you want pure analytics, save yourself the ML courses and own the Data Analyst identity.

What is the final verdict on junior data science roles for Indian professionals?
The "Data Science" title covers a wide spectrum of work, most of which isn't machine learning. If you joined expecting research and models, you'll find SQL and dashboards. The mismatch causes disillusionment, but it's not the field's fault; it's expectations vs. reality.

Final Verdict

The Data Science Truth:

The "Data Science" title covers a wide spectrum of work, most of which isn't machine learning. If you joined expecting research and models, you'll find SQL and dashboards. The mismatch causes disillusionment, but it's not the field's fault—it's expectations vs. reality.

The Uncomfortable Question:

How much of your current role is actual ML vs. data manipulation? If it's 80%+ data work, you're a Data Analyst with an inflated title. Accept that, or actively seek ML Engineering roles at ML-first companies.
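
If you would rather measure than guess, log a typical week and compute the split. The sketch below is one hypothetical way to do it: the task labels, the sample hours, and the function names are all assumptions for illustration, and the 0.8 threshold is simply the 80% figure from the question above.

```python
# Hypothetical task labels; adapt them to however you track your own week.
DATA_WORK = {"sql", "cleaning", "dashboards", "ad_hoc_requests"}
ML_WORK = {"modeling", "model_deployment"}


def work_split(hours_by_task):
    """Return (data_share, ml_share) as fractions of all logged hours."""
    total = sum(hours_by_task.values())
    if total <= 0:
        raise ValueError("log at least some hours before computing a split")
    data_hours = sum(h for task, h in hours_by_task.items() if task in DATA_WORK)
    ml_hours = sum(h for task, h in hours_by_task.items() if task in ML_WORK)
    return data_hours / total, ml_hours / total


def verdict(hours_by_task, threshold=0.8):
    """Apply the article's 80% rule of thumb to a logged week."""
    data_share, _ = work_split(hours_by_task)
    if data_share >= threshold:
        return "analyst work under a data science title"
    return "mixed role with meaningful non-data-manipulation work"


# Invented sample week (hours), not real data.
sample_week = {
    "sql": 12,
    "cleaning": 14,
    "dashboards": 6,
    "ad_hoc_requests": 4,
    "modeling": 2,
    "meetings": 2,
}

data_share, ml_share = work_split(sample_week)
print(f"data work: {data_share:.0%}, ML: {ml_share:.0%} -> {verdict(sample_week)}")
```

One week proves little. Log a month before you draw conclusions, and count anything you are unsure about as neither bucket rather than flattering the ML column.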

What Actually Works:

  1. Set realistic expectations—analytics first, ML maybe later
  2. Target companies where ML is the product, not a nice-to-have
  3. Consider ML Engineering if you want to build models professionally
  4. Build an independent ML portfolio if your job doesn't provide ML opportunities
  5. Embrace the Data Analyst role if that matches your actual work
  6. Specialize in ML infrastructure (MLOps) for better positioning
Last Updated: January 13, 2026

What Changed

  • March 29, 2026: Fact-checked core claims against AmbitionBox, Glassdoor India, and LinkedIn hiring data. Corrected stale salary figures and re-validated growth projections.
  • January 13, 2026: Updated data science salary ranges for 2026, refreshed market positioning benchmarks, and corrected stale compensation data against current hiring signals.
  • December 23, 2025: Initial publication of this data science career reality check with market framing, salary benchmarks, and trade-off analysis for Indian professionals.

Sources