The Ethics of Automating Education
We hear it all the time now: AI will be the game changer of the future. I often think about my students when researching the latest in immersive education and technology in general. I feel like I am doing a service by educating students and communities about what robot might be around the corner for us. I see so much potential in a future where technology frees us to reach higher in thought and aspiration. But I also have my fears and doubts, mostly because we are already in the 3rd inning of this social/economic paradigm shift and signs show we don’t have a pitcher in the bullpen for after the 7th inning stretch (can’t believe I used a baseball analogy; I wonder what the Venn overlap of baseball fans and educational technologists is?). In other words, speaking from a US perspective, we are not prepared for what is coming: not in terms of laws, nor protections, nor literacy for the generation that will feel the full brunt of this change. There is a paradox I am seeing everywhere, different from the well-established paradox of automation: humans automate processes, issues arise, and we look to automate away those issues. We need to scale the solutions like we are scaling the issues around automation, and for that some form of automation is needed. Take the idea of teaching truck drivers to code because autonomous trucks will take their jobs. That solution doesn’t scale along with the problem, and doesn’t fit or make sense in many cases anyway.
Just to name some examples of this paradox: Mr. Zuck of FB thinks the answer to getting our news on his platform is perfecting the algorithm used to curate information for you. Amazon needs all your buying history to disrupt the bloated health insurance market. The court system wants to streamline the penal system by predicting recidivism, which needs more criminals and more criminal data to be more accurate. This is very much related to the traditional view of the paradox of automation, where the more efficient the automated system, the more crucial the human contribution of the operators. Humans are less involved, but their involvement becomes more critical. And when issues arise with the automation, only a few people understand how to fix it.
So the more we advance in automating our lives, the more a few key people, a few important decisions, and a few core values will scale to increasingly massive effects. This, I think, is the main cause of my anxiety about the future: humanity tends to be more reactive than proactive as a whole, but when it comes to automation, the less proactive we are, the more reactive we may have to be. Privacy legislation, net neutrality, the legality of closed or nontransparent AI algorithms, and the ethics of automating the education of the next generations will all play out with large consequences, and it seems that, in different ways in different countries and regions, things are not shaping up fast enough. I want to touch on all of these concepts, but I will do my best to focus on the last idea of automating education, because ultimately we will need to be informed well enough to navigate the changes that automation will cause.
Garbage in, Garbage out
Let’s take FB as a first example, a company that can’t seem to get much right these days and is really feeling the ‘techlash’ from continuously obfuscating its responsibility for having so much power over the world. Think what you want about Facebook, but it has become the emblem of the automation of how our information gets curated, and of how we are shifting from consumers of information to the raw materials used to create lucrative predictive models. We look at and interact with content, and those actions are recorded and put to work predicting more of what we want to look at and interact with. This has led to consumption loops, or even compulsion loops when the social functions are added. The concept was great: we see more of the people and things we like. But the business model is not to keep you informed, it is to keep you engaged, and keeping you engaged serves the self-serving agenda of collecting more data to keep you engaged in the future.
This is ‘training’ the FB algorithm to exacerbate these consumption/compulsion loops, and the training/optimizing of these algorithms started with a decision made at some point by a small group of people in a boardroom. Now that important decision, by a few key people, is being amplified by a working system to staggering effect. Google search is optimized for information, Facebook feeds are optimized for engagement. When AI systems are optimized for relevant information, we get Wikipedia, and when AI systems are optimized for engagement, we get Alex Jones. I cannot overstate just how widespread an effect this decision has had and will continue to have on all of us. Everything from your insurance rates to your employment and even your ability to get a loan can now be tied to what you do on Facebook.
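To make the point about one boardroom decision concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the posts, the `accuracy` and `outrage` scores, and the `rank` function are assumptions, and nothing here reflects how Facebook or Google actually rank content. It only shows that the same pool of content produces a very different feed depending on which objective someone chose to optimize.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    accuracy: float  # hypothetical 0-1 score: how informative the post is
    outrage: float   # hypothetical 0-1 score: how provocative the post is


# A made-up pool of content; the scores are invented for this sketch.
posts = [
    Post("City council budget explained", accuracy=0.9, outrage=0.1),
    Post("You won't BELIEVE what they are hiding", accuracy=0.2, outrage=0.95),
    Post("Local school test results", accuracy=0.8, outrage=0.3),
]


def rank(posts, objective):
    """Order post titles from highest to lowest score under the chosen objective."""
    return [p.title for p in sorted(posts, key=objective, reverse=True)]


# The "boardroom decision" is just the choice of objective function.
informed_feed = rank(posts, lambda p: p.accuracy)
engaged_feed = rank(posts, lambda p: p.outrage)

print("Optimized for information:", informed_feed)
print("Optimized for engagement: ", engaged_feed)
```

The content never changes between the two feeds. Only the one-line objective does, and that single choice decides what lands at the top for every user the system reaches.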
Excerpt from the Forbes article “Life Insurers Can Use Social Media Posts To Determine Premiums, As Long As They Don’t Discriminate” (link below):
A New York court decision in 2010 (McCann v. Harleysville Insurance Co.) declared that an insurance company could not conduct “a fishing expedition” into someone’s Facebook account “based on the mere hope of finding relevant evidence,” but clearly insurers are finding workarounds. At the very least, it appears that the Fair Credit Reporting Act might give customers who are denied insurance the right to know whether the decision was based on information gleaned from a social media profile. This could provide fodder for lawsuits that could clarify the boundaries for everyone.
This idea is as old as computer science: ‘garbage in, garbage out’. I remember in my AP comp sci class in high school, my teacher asked me and another student to create a program to input, store, track, and analyze the data of our baseball team. We were given a template data sheet that would be used to track the player data, like at bats, hits, runs, etc. We spent a couple of weeks writing code, using some pseudo data we made up to test our software. Then we were given the marked sheets to enter into our software. We couldn’t read them, they looked like chicken scratch to us, but we did the best we could. We made printouts of the stats of all the players and delivered them to the team. They seemed appreciative to get the information, but thought it was all a bit strange and that it didn’t match the counts they themselves had been keeping. We thought we needed to defend our software, and showed some players the stat sheets we had to work with. The results were shrugs all around and a piece of software never to be used again. We didn’t know the term at the time, but this is a staple phenomenon in computer science, and at the scale of AI it could be pretty devastating.
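A minimal Python sketch of that baseball story shows how little it takes. The player names, the numbers, and the helper functions below are all hypothetical, invented for this illustration rather than taken from the original class project. The calculation is perfectly correct in both cases; only the input differs.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at-bats, rounded to three places (0.0 if no at-bats)."""
    return round(hits / at_bats, 3) if at_bats else 0.0


# Hypothetical season totals as (hits, at_bats): what happened on the field.
actual = {"Player A": (30, 100), "Player B": (22, 100)}

# The same sheet as it might be read from messy handwriting:
# a 3 mistaken for an 8, and a 2 mistaken for a 7.
as_read = {"Player A": (80, 100), "Player B": (27, 100)}


def report(sheet):
    """Compute a batting average for every player on the sheet."""
    return {name: batting_average(h, ab) for name, (h, ab) in sheet.items()}


true_stats = report(actual)
printed_stats = report(as_read)

for name in actual:
    print(f"{name}: actual {true_stats[name]:.3f}, printed {printed_stats[name]:.3f}")
```

No bug in `batting_average` explains the gap between the two columns. The software faithfully turns bad input into bad output, and a printed stat sheet gives no hint of which one you are holding.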
Link: Life Insurers Can Use Social Media Posts To Determine Premiums, As Long As They Don’t Discriminate (www.forbes.com)
Let’s now transfer this idea to education, and to the idea of AI assessment. It may not surprise you to know that the majority of venture investment and funding for AI in education is from testing companies and those who wish to disrupt them. Integrating AI into the test assessment process is attractive to testing companies for many reasons. The big reason, besides the obvious reduction in labor costs, is having the ability to standardize assessment for seemingly subjective content like essays and other long-form answers. This is where we run into the same issues as FB and its deployment of algorithmic answers to the news feed. The AI that scores your essay needs to be ‘trained’, and many times that is done by feeding a deep learning system with tons of samples and scores. That process is started by making a few key decisions, like what samples to use and what scoring system to employ. Let’s say that a tech company in London is working on a new AI essay assessment platform. They train a system with the top essays from top English universities, and with what they think are substandard essays as well. You have now potentially put a posh stamp of approval on the voice in which an essay must be told, at scales that go well beyond the English system. How the system was trained must be kept a secret, as they will tell you it is proprietary corporate information and protected under law, and they’d be correct. Companies have often refused to share details of their algorithms with customers and the law has allowed them to do so. This ability to keep the core parameters of training AI systems secret has already become a huge problem in the US court system, where an AI is used to steer judges’ rulings by assigning recidivism scores to defendants.
Link: Sent to Prison by a Software Program’s Secret Algorithms (www.nytimes.com)
But the true problem here, I think, is again the business model. It makes much more business sense to keep your assessment AI training procedure and parameters a secret, so you can also create training materials for those who wish to score better on your test, and so people cannot ‘game’ the test by knowing some key points of focus. So what we end up with is an assessment system that has the power to scale and affect students all over the world, but that hides the details of how the scoring is done, and therefore blurs company responsibility for any unseen, undesirable social effects.
Do no harm
Now let’s turn to Amazon. The model by which the US judges whether a monopoly must be broken up is based on this idea of public harm. Many economists and lawmakers argue that Amazon should be allowed to continue to gobble up its competitors because Amazon is good for consumers. It’s more convenient, it’s cheaper, and in many ways it has superior service. Antitrust laws are meant to protect consumers, not competitors. Others are suggesting we need a new model in this age of AI and automation, as networks of scale empower the mega-huge companies with more data, which allows the collection of more data… and with the power of that data they run their competitors into the ground.
We have seen Amazon do this time and time again, from diapers in 2010 to microwaves in 2017. But I love Amazon in my life, and nothing expresses my feelings better than a recent episode of Patriot Act with Hasan Minhaj. I am getting the sense that Amazon IS doing public harm, just not to me. I get my podcasting equipment and deodorant shipped to my door cheaper and faster than going to the local store. Meanwhile the local store is hurting, which will have lasting effects on my community; we are just not sure to what extent. I feel that Uber, AirBnB, and other gig-economy job-creating companies are a symptom of Amazon and companies like it. Much like Facebook sees its users as commodities, these gig-job creators’ workers are becoming more like commodities.
So how does this tie into the future of education? Easy. Amazon is in a virtuous (for others, vicious) circle of networks of scale, data collection, and automation: a self-reinforcing system that grows and grows and grows, until all products and services are Amazon. This same concept is happening in elearning, at least in some ways. It is much harder to see, as education as a market is a very dysfunctional one from an economic standpoint: prices are going up, value is going down, and demand is many times misrepresented. So making the link between Amazon and a would-be Amazon of learning is still a bit… fuzzy to me. But there is enough disruption out there so far to indicate things are headed the “Amazon” way.
First of all, “hooray for elearning!” is something I say to myself as an educational technologist on an almost daily basis. I love elearning in my life: it’s more convenient, it’s cheaper, and in many ways it has superior service to many face-to-face classroom environments (see where I’m going with this yet?). The major link here is scalability. Much like Amazon, digitized learning content can reach audiences that no teacher or professor could ever have dreamed of reaching in past generations. So who can/will scale? We already have just a couple of huge players, and not surprisingly the universities with the biggest endowments are playing big in elearning. An Ivy League school is putting Hollywood budgets into making elearning content and leveraging money and branding to be the place everyone wants to learn basic chemistry or biology online. Again, good for consumers/learners but potentially bad for competitors. I say potentially, because it depends on the type of learning/teaching you do.
Link: The Open University started as a radical idea, now it’s in trouble (www.wired.co.uk)
You can learn basic chemistry from the world’s leading chemist, but we may still need a place to experiment, a coach to guide us, or a team to collaborate with. But make no mistake, these huge online markets are benefiting from the same virtuous (for others, vicious) circle of networks of scale, data collection, and automation that Amazon is. And while it may still be harder to see the harm to the consumer and to learning communities as a whole, I believe we will see the commodification of learning content get funneled to an apex, with the very knowledge that is needed to create new AI systems and networks to rival the huge providers set at a premium price. This is already the case: online AI courses at MIT or Harvard are free at the intro level, but the latest and most current courses can be some of the most expensive courses on the internet. And having the large data sets lets them know which courses can be priced at what. It is the same issue of business models at work.
As a researcher in augmented and virtual realities in learning contexts, I can see all of these issues coming together and potentially exacerbating each other in these new mediums of content delivery. I laid out some of these concerns in a talk recently at SXSWedu in Austin, TX.
Augmented and virtual realities are close cousins of AI and will have a big role in automating our learning in this new paradigm, this new world of AI, this new world of empathic computing. I claim that AR will be the medium of choice in an AI hyper-curated, hyper-contextualized, just-in-time information and learning culture. This medium and the use of it will exacerbate the network effects and data mining abilities of the mega-huge platforms. As we move more to input methods like voice and touch, more data is collected, but less is given. Take Google search: input a search string using a keyboard on your laptop and you get a ton of search results back on your monitor. Ask your Google Home the same query and the device, which has been trained to listen to and understand your voice over time, gives you only one answer. How that answer gets to you is again decided by an algorithm that had some key design stages, AI training, and business-model boardroom influence, all of which are very opaque. We will be giving up our whereabouts, what we look at, how we speak, and our physical condition to ‘free’ platforms and getting very little in return that is useful to the learner, or at least handing the advantage to those systems and companies who seek to benefit from your data. It is a scary thought to extend the earlier example of insurance companies mining your Facebook posts for insurance fraud to the data mined from, say, a Fitbit by a potential health insurer.
More good things will arise for learners, too. Imagine students getting just-in-time, contextualized learning content based on where they are, who they are with, and what they are doing. Traditional learning in the form of lectures and tests is already over for younger students; ironically, school is the only place they still have to do it. But this will mean much more content will be needed, content that is relevant in more situations and more learning contexts. A legal and ethical battleground will form around the trade-off between mining student data and giving relevant, contextualized learning content. We will need to stop the collection of data for collection’s sake, at least I hope, for students and especially young learners. Taking some cues from Foursquare CEO Jeffrey Glueck, I have updated a set of values that I believe should be the starting point of policy for companies and schools moving into immersive education in augmented and virtual learning environments.
Values of Immersive Learning Data Collection
- Hippocratic oath of data collection (do no harm)
- View data as a privilege
- Informed Choice/consent (opt-in or opt-out)
- Adding Value to END USER (collect data for a specific stated purpose)
- Control to the user (right to be forgotten, ability to port data elsewhere)
Moving forward, there are even bigger questions on the horizon. As we integrate our technology into our bodies and become more deeply connected to our media, these issues of ethics around immersive learning will likely carry forward into the next wave of technology-enhanced communication.
Link: The Ethics of a Brain-Computer Interface in VR | Digital Bodies (www.digitalbodies.net)
In the SXSWedu talk I make more suggestions about things education leaders can do/support to make this transition into immersive learning more equitable for more people. Privacy by design for developers, net neutrality legislation for lawmakers, and digital citizenship education for our next generations are just some steps to start moving on now.
The lines between digital and real will blur, and I feel we need to start making some bigger moves to prepare our society for these major shifts in how technology will allow/force us to live. The preventative action we take today may make the difference between a Star Trek future and a Soylent Green future.