Behind the Curve: The Slow March of AI into Classrooms

I will start with the obvious disclaimer: I work in this domain, so I may have noticed the trends I describe here more than most people would. That gives me broader insight, but it is also an evident bias. It’s a double-edged sword, so I welcome any counterpoints.

With the explosion of generative AI, and the currently fashionable use of high-quality synthetic data, many folks are talking about a Cambrian Explosion of AI. It has enabled AI to permeate not just the public consciousness but also almost every industry one can imagine. Even traditionally not tech-heavy domains such as real estate seem to be incorporating it, thanks to the evangelisation of the field and the ease of integrating these models. The return on investment has increased quite a bit, especially with the advent of large language models. But there is one industry which is still in the “testing the waters” phase: education. One recent tweet provides more concrete numbers.

The second graph, on AI transactions, really puts the nail in the coffin. There is less use of AI in Education than in Government. Education has been beaten by the industry with the reputation for being the slowest adopter of new technology. But how did we end up here? For that, let’s first go back to 2015.

“Big” Personalised Education

The middle of the 2010s was when online education became a thing. Coursera had a lot of momentum, and MOOCs were all the rage. In this new era of education, there was another fashionable factor: Big Data. All the hype aside, there was real value in personalising content based on algorithms crunching out predictions from large, rich datasets. The natural expectation was that with the advent of online education, collecting data would become easier and hence, so would personalisation.

In 2015, the Chan Zuckerberg Initiative (CZI) funded a new project around personalising education for students at scale [1]. Its goal was to deliver hyper-personalised content based on how students did on tests. Around $100 million was put into it. The very next year, the Gates Foundation funded a similar project, and in 2018 Sal Khan echoed the same sentiments [1]. So how did these projects go? Did we solve personalised education? The answer is in the announcement from the CZI project in late 2023: they are shutting it down after not seeing good enough results.

While personalised education has not reached its potential yet, other innovations such as cohort-based courses and gamification have been spearheading the online education sector [2, 3]. Not only does this direct the attention of builders towards those approaches, it also draws them away from further work on personalisation. That, in turn, has more downstream effects.

Operationalising

Here’s a small exercise you can do if you want. Pick a term from the domain of Educational Data Mining (EDM), say “knowledge tracing”. Now go to GitHub and search for it. If you’re feeling lazy, here you go. Apart from one or two, most repositories are unmaintained and stale. EdNet, which is one of the landmark datasets of the field, has not been updated in 4 years. One of the libraries we, at kaksha, were using internally stopped working a few months back because of breaking changes in its dependencies, a result of the maintainers not caring enough. I am not singling them out, just pointing out the general landscape of EDM and how it’s applied.

Without the infrastructure of well-maintained libraries, documentation, and communities having discourse about it, the possibility of turning EDM from a purely research-focused domain into a more applied one seems far-fetched. Add to that the lack of standardisation among libraries. There might be multiple knowledge tracing libraries, each with its own format for handling data. Compare that to data science, where everyone supports numpy arrays and pandas dataframes. Sure, at this point it might seem more like a rant from personal experience, but I have spoken to a few other EDM researchers who have echoed similar sentiments.
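To make the format problem concrete, here is a minimal sketch of the glue code this situation tends to force on practitioners. Everything in it is an assumption for illustration: the two layouts (one row per interaction versus one pair of sequences per student), the function names `long_to_sequences` and `sequences_to_long`, and the sample data are all hypothetical and do not describe any particular knowledge tracing library.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical layout A: one row per interaction, in chronological order.
Interaction = Tuple[str, str, int]  # (student_id, skill_id, correct as 0/1)

# Hypothetical layout B: per student, parallel lists of skills and answers.
Sequences = Dict[str, Tuple[List[str], List[int]]]


def long_to_sequences(rows: List[Interaction]) -> Sequences:
    """Group row-per-interaction data into per-student sequences."""
    skills: Dict[str, List[str]] = defaultdict(list)
    answers: Dict[str, List[int]] = defaultdict(list)
    for student, skill, correct in rows:
        skills[student].append(skill)
        answers[student].append(correct)
    return {student: (skills[student], answers[student]) for student in skills}


def sequences_to_long(seqs: Sequences) -> List[Interaction]:
    """Flatten per-student sequences back into one row per interaction."""
    rows: List[Interaction] = []
    for student, (skills, answers) in seqs.items():
        for skill, correct in zip(skills, answers):
            rows.append((student, skill, correct))
    return rows


# Made-up sample data, purely for illustration.
rows = [
    ("s1", "fractions", 1),
    ("s2", "decimals", 0),
    ("s1", "decimals", 0),
]
seqs = long_to_sequences(rows)
round_trip = sequences_to_long(seqs)
```

Neither function is hard to write, but without a shared format every team ends up writing and maintaining its own version of them for each library it tries.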

Adding insult to the injury of missing standardisation is the lack of reproducibility of results. While this is a general problem in machine learning and data science more broadly [4], EDM research has its own set of problems. Firstly, there’s the issue of sub-domains and cohort demographics. The way adults learn is different from the way adolescents learn, which is different again from early childhood learners. College education is completely different from school education. Problems like these ensure that a lot of research just does not generalise and ends up being an interesting insight that can be applied to one situation, instead of a general solution. Secondly, confounding variables are more challenging to identify. While all research has the problem of such variables, it is a little harder in educational research to pin them down and quantify them [5]. There are more intangibles, such as the effect of a teacher’s presence and the partial learning that comes from interacting with peers off-platform (i.e. informally). Further, even if these issues of operationalising research were to be mitigated, there are still pressing challenges.

Public perception

AI takes a lot of capital investment. All of the data management, labelling, model training, deployment, etc. takes effort and money. VCs are investing millions in AI startups around the world, but there is one sector which is lagging behind. Funding for ed-tech startups dropped 49% from 2021 to 2022 [6], and while the numbers may have flattened out right now, that’s still not great news. Add to that this one graphic.

In general, the education industry is starved of capital. But why is that? There are two reasons, one short-term and one long-term.

The short-term reason is simple. The pandemic provided a boom in online education. There was a forced move towards it, which is now being rolled back slightly. Learning quality was affected by this sudden change in delivery [7]. As more and more people go back to traditional learning settings, investment seems to have returned to normal levels rather than entered a lasting downward trend. Add to that the hit to the perception of EdTech startups from examples such as Byju’s, which raised billions only to make no discernible difference to the quality of education provided to its students. So investments are slower, but companies working on the real problems of education are still being supported.

The long-term reason is, quite obviously, more complex. The general perception of modern society is that “education should be free”. I would change that slightly to “knowledge should be free”, because education is delivered by people, and those people should be fairly compensated. Look at teachers in the US: they are among the worst-paid professionals in the country by a mile [8], and the situation is not that different in other geographies either. This perception permeates the public consciousness and ends up hurting educators, who could make learning so much better if only they were supported the right way.

What Next?

To recap, there was a lot of initial hype around personalising education using AI, but the initial results fell short. Add to that the bad press about other edtech startups and the bursting of the pandemic investment bubble, and investment in the sector slowed down. On the technical side of things, the lack of enough motivated individuals and of standard infrastructure to run large-scale experiments on creates a vicious cycle which makes applying EDM research harder.

So is it all bad? Not at all. The existence of companies like ours is proof of that. Despite the challenges, there are still motivated individuals, such as the ones who are maintaining libraries and making large-scale experimentation on educational data easier (ASSISTments). There are still the Baker’s Learning Analytics Prizes to be won. Khan Academy and Duolingo are still running learning science experiments at scale. And every year amazing new tools come out of the Learning Tools Competition. With this slower, more sustainable growth of intellectual capital in the field, the financial capital will soon follow, and hopefully the slow march of AI into classrooms may just be a triumphant one.

References:

  1. Dan Meyer, The Misunderstanding About Education That Cost Mark Zuckerberg $100 Million. https://danmeyer.substack.com/p/the-misunderstanding-about-education
  2. Tiago Forte, The Future of Education is Community: The Rise of Cohort-Based Courses. https://fortelabs.com/blog/the-rise-of-cohort-based-courses/
  3. Amina Khaldi, Rokia Bouzidi & Fahima Nader, Gamification of e-learning in higher education: a systematic literature review. https://slejournal.springeropen.com/articles/10.1186/s40561-023-00227-z
  4. Harald Semmelrock et al., Reproducibility in Machine Learning-Driven Research. https://arxiv.org/abs/2307.10320
  5. Confound It! Or, Why It’s Important Not To. https://www.qualitymatters.org/qa-resources/resource-center/articles-resources/confounding-variables-in-research
  6. Michelle Caffrey, Venture Capital Investment in Global Ed Tech Plummets, Returning to Pre-Pandemic Levels. https://marketbrief.edweek.org/marketplace-k-12/venture-capital-investments-global-ed-tech-plummet-returning-pre-pandemic-levels/
  7. D. Irwan, Online learning implementation. Acitya Journal of Teaching & Education, 3(2), 280-293, 2021. https://doi.org/10.30650/ajte.v3i2.2258
  8. Madeline Will, The Gap Between Teacher Pay and Other Professions Hits a New High. How Bad Is It? https://www.edweek.org/teaching-learning/the-gap-between-teacher-pay-and-other-professions-hits-a-new-high-how-bad-is-it/2022/08