How do you verify the output of something more intelligent than you? That's Dixit's dilemma of Generative AI. It's not famous yet 🙂.
But as we employ Generative AI more, this dilemma will become famous. Take the case of code generated by GenAI: the developer using it needs to be more skilled than the complexity of the generated code in order to judge the correctness of the output. A corollary of Dixit's dilemma is that our qualitative gains from GenAI are capped by the quality of the verifier!
Did someone talk of GenAI making your job redundant? It can in the quantitative domain but not in the qualitative domain (the mundane can be replaced, the classy can't be).
EDIT: Some more clarifications
The holy grail of code generation.
The holy grail that we are all chasing with tools like Copilot is that they generate the whole code from requirement documents. This is not a stated need, but that is what people are hoping for. Read on.
If you look at the popular demos of code generation by GenAI tools, a few scenarios occur often:
1. Given a table or JSON, generate a complete web service
2. Find the performance or security bugs in code
3. Find structural issues with code as per the language specification
4. Flag issues with dependencies from deep within the codebase
5. Document my code
6. Write test cases, elementary ones
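As a concrete illustration, scenario 1 typically produces a CRUD scaffold. Here is a minimal, hypothetical sketch of the kind of code such a tool emits from a sample JSON record; all names here (`ProductService`, the schema fields) are illustrative, not taken from any specific tool:

```python
import json

# Schema inferred from a sample JSON record -- the "input" of scenario 1.
SCHEMA = {"id": int, "name": str, "price": float}

class ProductService:
    """In-memory CRUD service scaffolded from SCHEMA (toy stand-in for
    the web service a GenAI tool would generate)."""

    def __init__(self):
        self._store = {}

    def create(self, record):
        # Validate the record against the inferred schema before storing.
        for field, ftype in SCHEMA.items():
            if not isinstance(record.get(field), ftype):
                raise ValueError(f"field {field!r} must be {ftype.__name__}")
        self._store[record["id"]] = record
        return record

    def read(self, record_id):
        return self._store.get(record_id)

    def delete(self, record_id):
        return self._store.pop(record_id, None) is not None

svc = ProductService()
svc.create({"id": 1, "name": "widget", "price": 9.99})
print(json.dumps(svc.read(1)))
```

Note that even this trivial scaffold needs a reviewer who understands the validation logic, which is exactly the verifier problem described above.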
Now, if you look at the capabilities of IDEs, code generators, profilers and linters, such things have existed for decades for most programming languages. But they were siloed in nature. So we can give credit to LLM tools for bringing it all together. This can benefit existing codebases a lot.
The situation becomes interesting for a new codebase. The demos where one makes the tool write a script for managing a server, or a sorting algorithm, are in essence demonstrations of smart lookup (of a relevant code fragment). This can benefit seasoned developers by saving a few keystrokes. What next?
What we all expect next is: can the tool take long-format requirements and generate functional code? That is to say, can it read my Jira requirements and understand my user personas, interfaces, domain jargon, different flows, and so on? It is certainly possible to give lots of prompt context to AI tools and create such a demo.
Can we do it at scale for an organization?
Can we do it in a manner where prompting the requirement context doesn't become as complex a job as that of the programmer? AND
How can we validate that whatever is generated is functionally correct, without needing a new battalion of validators?
These are the practical questions for which we don't have answers.
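One partial answer to the validation question is to make the verifier independent of the generator: encode the requirement itself as properties and check generated code against them. A minimal hand-rolled sketch (the `generated_sort` stand-in is hypothetical, playing the role of AI-generated code under review):

```python
import random

def generated_sort(xs):
    # Stand-in for AI-generated code under review (here it happens to be correct).
    return sorted(xs)

def verify_sort(fn, trials=200):
    """Property-based check encoding the requirement directly:
    output must be ordered and be a permutation of the input."""
    for _ in range(trials):
        xs = [random.randint(-50, 50) for _ in range(random.randint(0, 20))]
        out = fn(xs)
        assert all(a <= b for a, b in zip(out, out[1:])), "not ordered"
        assert sorted(xs) == sorted(out), "not a permutation"
    return True

print(verify_sort(generated_sort))
```

The catch, of course, is that writing good properties for real business requirements is itself skilled work, which is the battalion-of-validators problem restated.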
This is not to say it's not possible; domain languages for automatic code generation have been in place for long. Coupled with GenAI, they can do wonders. It's just that the current breed of GenAI is not made for this sole purpose. The day that happens, all the GenAI demos will happen to CTOs instead of CEOs 🙂. Read that again 🙂
Guess this wording settles the debate against the developers. The narrative has been set for the use of LLMs for productivity boosts with this phrasing! But are developers burdened with expectations, or is there some hope of real usefulness to developers with the help of AI? Let's check.
Using LLMs for coding tasks is expected to give a huge productivity boost. McKinsey has already released a report on measuring developer productivity. Organizations like OpenAI/Microsoft (ChatGPT) and the likes of Meta (Code Llama) are coming up with code LLMs and numbers on productivity boosts. There are new coding LLMs, or LLMs with such capability, coming up every week, so this list is non-exhaustive.
Paid to code or paid to recollect ?
However, the question remains: can LLMs in their current form boost developer productivity across the board? The answer is no. Only the developers who have a clear picture in mind of what they want can save some typing work with tools like Copilot.
However, the cognitive work for the developer now shifts towards reviewing the code generated by AI, which is heavier mental work than writing code via learned skills.
This also raises the question of memory recall of such AI-generated code at scale! Developers, after all, are not paid just to "code it". They are valued for knowing their code, maintaining it and debugging it. Will developers have a clear mental model of such AI-generated code? And what is the limit of such active recollection? For now we don't know the answers to these questions. We can take a guess from our experience with social media: we honestly don't have first-class recall of the virtual updates from our network when compared to actual moments spent in the physical world. Even if someone argues that social media is not approached with the seriousness coding is, the limits of our fatigued brains are known by now. So I would conclude that the so-called boost to developer productivity is overhyped.
The Future of coding with AI
There is, however, a good case for vendor-curated assistants. Imagine a coding assistant from Spring/Java or Vue.js or the like. These can help us with smarter code highlighting, offer deeper code review and provide standardized lookup of code samples for developers to use. They can also have a real connection to the language runtime and offer better suggestions on optimizations and debugging.
This is probably the less shiny, less geeky middle path to developer productivity with the help of AI. Exactly in the spirit of Pair Programming with a Large Language Model.
Remember "Talking Tom Cat"? This was one of the unsung heroes that made the Android OS popular: a cartoonish cat that echoed whatever you spoke, in a cat voice. This was a fine demonstration that, with the prevailing hardware, Android could replay voice with modification. Of course the lay users would not word it this way, but they got the point anyhow. This is exactly how technology goes mainstream. What followed Talking Tom was a set of apps that could add a dog face to your picture, then to your video, and so on. By then the novelty had faded; the capability was taken for granted. People expected mobiles to do at least this much, and moved on.
Users perceive ChatGPT as a bright teenager
ChatGPT in its current form is the Talking Tom of AI. This is the first time the common lay person is getting a demo of how much more can be done with AI. For all he knows, it's a nice chatbot that can do brainy stuff. How brainy, you ask?
This is an important question to ask. If we were to equate the "general" feel that ChatGPT gives to the most common interactions people have, what would it feel like? Remember the spelling bee contests held in the USA? For most lay users, ChatGPT feels like a smart teenager who is a G.K. bee (in that sense). This is how the consumer mass market sees innovation: simplified and equated to mundane things in his/her life.
Two directions in which LLMs need to evolve
LLMs for consumer use cases
The above discourse makes it clear that large language models (LLMs) need to evolve, and also be understood, along two distinct sets of parameters. One is the consumer angle. Taking a leaf from how Android was seen, or how voice assistants were seen, LLMs for the lay consumer simply mean that computers can now answer diverse and more complex questions. This also implies that however much the press focuses on ChatGPT, the future of LLMs is in the usage-driven consumer space. These are specialized models that do one or a few things in one area, with unambiguous and immediate utility. Imagine an app that can take your picture or live video and suggest fashion makeovers to you (idea copyrighted hereby 🙂). Or take for example BloombergGPT, which aims to offer targeted utility to end users. Similar GPT models can be built around legal advisory, medical first-line help, or the cultural adjustment needed during travel. A general LLM that can filter money-laundering names can make life easy for regulators.
OpenAI is aiming to be THE general-purpose engine for all such use cases via its plugin architecture. Whether it can succeed in giving a curated user experience is a matter of debate with GPT-4. With future versions things might change. But it can also be a case of diminishing returns, where the model size and compute cost can't justify further refinement. As far as end users of AI are concerned, they are interested in the utility, not the specifics of the software internals.
LLMs for AI community
Information ownership and privacy leakage are two important issues any LLM has to handle. We have learned many lessons from years of legal cases and government requests for page takedowns to search engines. Once the hype subsides, LLMs fed on public information will soon get into all of this mess.
And don't even think what will happen if ChatGPT gives an answer that is blasphemous in some culture. This is my main reasoning as to why GPTs in chat mode won't harm Google's search business. There is a need for sanitizing, curating and localizing the outcomes, and no one knows it better than Google. It's just that they need to offer the same LLM toppings on their pizza too.
But as a community we need to keep pushing the boundary on parameters. Efforts will also be made to plug in knowledge representation (universal or specific) with LLMs for more deterministic answers. Size/cost optimization, real-time model updates at this scale, and geo-distributed LLMs are a few directions in which efforts can go.
Premature universal-knowledge claims by LLM PR machines
It's not the AI scientists but the PR machines that are claiming we are very close to general intelligence. As far as computing is concerned, LLMs have given a feel that they are generally intelligent. We must remember that Google's LaMDA was the first LLM that was said to be sentient (funny how Google lost the PR battle). So, on the basis of "feels like human", LLMs have started giving a feel that they are human-level, or intelligent, or both. Moreover, given a focused effort, a "self" neural module could be built into LLMs. Say, an LLM that can sense that its cloud billing is crossing the daily threshold, and starts feeling tired.
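The billing-threshold example can be made concrete as a toy wrapper. Everything here is hypothetical and deliberately simplistic; the point is how shallow such a bolt-on "self" would be:

```python
class SelfAwareWrapper:
    """Toy 'self' module: tracks a running cost and degrades behaviour
    past a daily budget -- the 'feeling tired' of the example above."""

    def __init__(self, daily_budget_usd, cost_per_call_usd):
        self.daily_budget = daily_budget_usd
        self.cost_per_call = cost_per_call_usd
        self.spent = 0.0

    def mood(self):
        # The entire 'inner state' is one comparison against a counter.
        return "tired" if self.spent >= self.daily_budget else "fresh"

    def answer(self, prompt):
        self.spent += self.cost_per_call
        if self.mood() == "tired":
            return "(low-effort reply) " + prompt[:10]
        return "(full reply to) " + prompt

llm = SelfAwareWrapper(daily_budget_usd=0.02, cost_per_call_usd=0.01)
print(llm.answer("question one"))  # still under budget
print(llm.answer("question two"))  # crosses the budget on this call
print(llm.mood())
```

A counter and an `if` give a convincing "feel" of fatigue without any of the lived limitations the surrounding text argues sentience actually requires.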
This is a funny example, but it tells you the inherent problem with machine sentience. Without the change and limitations that living beings experience, machines can achieve, at best, plant-level equality to being alive. For animal-level behavior they need the ability to grasp animal concepts, starting from the reptilian brain to the mammal and then the human brain. And also the concept of emotions that affect their whole existence (as opposed to giving the feel of an emotion).
As far as the end-user experience is concerned, LLMs in their current form do "feel like human"!
The second challenge is: do LLMs have universal knowledge? Anyone working on web search or elementary ML knows the answer is NO. Current LLMs are limited by the thin slice of information they were fed. So in reality this is more of a media claim than anything an AI scientist believes.
Societal impact of the post-LLM era of AI
How have TV, mobile and the internet affected humans/students/kids?
The cognitive-behavioral impact that the above waves of revolution had on humans will be further multiplied by the capability expansion brought by AI (apart from LLMs, image search was one such capability expansion, but it was under-hyped). So this issue, and the debate and remedies that follow, are known to us.
However, the issues of cultural, individual and situational sensitivity are something the centralized models are not geared up to handle. Nor are the efforts behind them aiming to. So a good number of "situations" where "feels like human" AI did not really "work like human" will come up in the coming decade.
New AI frontier
LLMs have not expanded the frontier of AI as a field. However, they are a first-class coming-of-age story for the community. As the next level of evolution, AI can now evolve in two directions.
Personal Models
Current efforts in AI come from corporate-style centralized AI designs. If there is any effort where a personal model can exist, it will be more revolutionary than LLMs: a sort of "AI thing" that stays with the individual and monitors, learns and advises him/her. Imagine your fitness tracker being able to pull data from your online activity and also listen to your speech and brain MRI. The corporate business case for this is lacking, but the impact the pursuit of such an "AI thing" can have on the AI community is huge.
Architecture for Sentient AI in 2026?
Leaving aside the debate of whether we really need it: once LLMs are seen as normal, a concerted effort of AI labs can go into developing a new neural architecture that learns from the evolution of sentience in living organisms. Whether we succeed or not, the AI community deserves to pursue its own "voodoo doll" moment, like all branches of science. In fact, if present AI labs get enough money, they might work on it sooner than 2026. It is one of those efforts worth failing at.
When it comes to AI and the general programmer/technology architect, one question I get asked is: how much of AI internals should they know?
The answer has a few nuances. A budding programmer or framework aspirant these days has to be full stack. Like it or not, that is a job-market reality. It takes a good couple of years for someone to become respectably good at full stack, i.e. frontend-services-db. Interestingly, cloud used to be hyphenated into this equation as a skill. Not now. Now the cloud is treated on par with your Eclipse/IntelliJ. At the same time, distributed computing has become the umbrella under which all job candidates now stand.
I would suggest that for generalist programmers and technology architects, AI will end up in the same league as cloud. 2023 is about the time it happens.
AI awareness becomes a commodity skill, like cloud
Detailing more on the path towards the AI-aware software engineer.
Most of the programmer community knows elementary stats from graduation, which can be brushed up to grasp what we call machine learning.
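As an example of how far elementary stats takes you: ordinary least squares, the entry point of most ML courses, is nothing more than means, variance and covariance from a first stats class. A minimal sketch:

```python
def fit_line(xs, ys):
    """Ordinary least squares via the elementary-stats formulas:
    slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    return slope, my - slope * mx

# The data lies exactly on y = 2x + 1, so the fit recovers slope 2, intercept 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)  # 2.0 1.0
```

The same brushed-up stats vocabulary (variance, covariance, correlation) carries straight into understanding loss functions and model fitting.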
It will also help to understand the piece on data engineering, as for a generalist this is one common intersection point with AI (integration).
However, when it comes to deep learning, it gets interesting. As such, the details of how a neural network works have no direct impact on the daily work of a programmer/architect. But as a learning opportunity, as well as the future of programming frameworks, we need to watch out.
The design of a tensor in TensorFlow is a good proxy for designing your own interpreted language. The RNNs and CNNs of the world are not only a delight to study, but they are also a possible direction for where our big data or distributed computing might evolve. In fact, there are already segments of the AI community working on distributed learning (compute) frameworks. I see no reason why the deep learning community and traditional language-framework creators won't exchange notes and copy from each other soon.
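To make the tensor-as-interpreted-language analogy concrete, here is a toy deferred-execution graph, in the spirit of how TensorFlow 1.x separated graph construction from evaluation (a sketch of the idea, not TensorFlow's actual API):

```python
class Node:
    """Tiny deferred-execution graph: building an expression records
    operations; nothing computes until eval() walks the graph."""

    def __init__(self, op, *inputs, value=None):
        self.op, self.inputs, self.value = op, inputs, value

    def __add__(self, other):
        return Node("add", self, other)

    def __mul__(self, other):
        return Node("mul", self, other)

    def eval(self):
        # A mini interpreter over the recorded graph.
        if self.op == "const":
            return self.value
        a, b = (n.eval() for n in self.inputs)
        return a + b if self.op == "add" else a * b

def const(v):
    return Node("const", value=v)

# Build the graph (3 + 4) * 2 first, run it later.
expr = (const(3) + const(4)) * const(2)
print(expr.eval())  # 14
```

Writing the graph builder and the interpreter separately is exactly the exercise of designing a small interpreted language, which is the point of the analogy above.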
With that as a direction, here is a simplified architecture diagram of the GPT of ChatGPT fame. This one outlines the essential elements without dumbing down the huge amount of work.
There is one social media post that keeps doing the rounds. It says that some NASA scientist claimed Sanskrit is the most suitable language for AI. People laugh at it as fake news. Here is a link to the 1985 article that started it all. But why is this fact not underlined loudly, you may ask? Multiple reasons.
First, Sanskrit has the oldest codified grammar, from around the 4th century BCE, attributed to Panini, who wrote it. The main feature of this language is that it denotes the same meaning even if you put the words in any sequence. This happens via reflectivity: in simple terms, the pre/post fixes of a noun apply to the verb, adjective and derived forms. This is a very neat preservation of semantic meaning. For other aspects of grammar, this language has very logical, maths-like rules, which feel like a programming language.
Now, this aspect of Sanskrit is given a mention in the intro sections of big books on compilers and parsers, including by the legendary MIT professor Noam Chomsky, whose language theory formed the basis of Natural Language Processing. But the world has moved on from Prolog and Lisp to more sophisticated models of NLP. So the early theory of the field won't get a mention in new papers; there is no motive here. The same has happened to works on compilers.
Cut to today: our early NLP libraries were built on the work of many researchers after Chomsky and relied on the typical parse-tokenize-interpret sequence. The current champion of NLP, i.e. BERT, was celebrated for its bidirectional application of relationships, which allowed better correlation of words in terms of their meaning. This feat actually derives from Sanskrit, but it is not an exclusive feature of Sanskrit. So we don't have a reason to find prejudice; nor can one person read every research paper where it might be mentioned.
Now, the next phase of the evolution of NLP is to move beyond sentences and arrive at comprehension at a higher aggregate level, like how you and I can understand poetry in all of its abstract, metaphorical, erratic flow. This calls not only for processing language in terms of words-relations-meaning, but also for iterative dives into parallel knowledge models and alternate meanings. Think of reading a satirical poem. When the next production-strength NLP winner is published, it will derive from the reflectivity Sanskrit has; but its mention in the research paper will depend on the prior work the researcher has referred to. So, as good as the ancient language is, given the way research methodology and citation work, we don't always have to suspect motives.
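The word-order point can be illustrated with a toy example. This is linguistically oversimplified and not actual Sanskrit morphology, but it shows the mechanism: if suffixes mark grammatical roles, the semantic frame survives any reordering of the words:

```python
# Toy role markers (invented for illustration, loosely echoing case endings).
ROLE_BY_SUFFIX = {"-NOM": "agent", "-ACC": "object", "-INS": "instrument"}

def semantic_frame(sentence):
    """Map suffix-marked tokens to roles; word order plays no part."""
    frame = {}
    for token in sentence.split():
        stem, _, suffix = token.partition("-")
        frame[ROLE_BY_SUFFIX["-" + suffix]] = stem
    return frame

a = semantic_frame("rama-NOM arrow-INS deer-ACC")
b = semantic_frame("deer-ACC rama-NOM arrow-INS")
print(a == b)  # True: same meaning despite reordered words
```

Contrast this with English, where "Rama shot the deer" and "the deer shot Rama" flip the roles; a parser for a position-based language cannot ignore order the way this one does.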
PS: I have tried to give a very simplified view of NLP and Sanskrit here, so experts in the field should pardon the simplified version of things.
Link to original paper of the images shown here : https://ojs.aaai.org/index.php/aimagazine/article/view/466
So, 2016 was the year of chatbots. See the graphic at the end (it's a big image) and we see so many mainstream companies having built their bots. A GitHub repo search will reveal a similar story. The buzz that chatbots are creating is huge, so much so that people are claiming that chatbots will kill websites and mobile apps soon. We noticed a similar buzz a decade ago when mobile apps and app stores became mainstream. The mobile app wave was also looked at with incredulity, so it is natural to be more welcoming towards the chatbot wave. But the similarity does not hold beyond the English sentence you just read.
The move from website to mobile was actually a reshaping of the form factor of the computing device. It is but natural, albeit in hindsight, that the content and its delivery fit themselves into the new form. In the mobile wave too, the mobile and the app were used interchangeably. While the mobile represents the shift, the app merely represents engagement. This difference is vital to analyzing the chatbot buzz.
Matt Schlicht of Chatbot Magazine defines: "A chatbot is a service powered by rules that a user interacts with using a chat interface. The service can be any number of things, ranging from form to function and live in any major chat product." One might agree with this definition as is, or differ with it in parts. While chatting itself is not a new paradigm, nor is the concept of a daemon accomplishing a task, it is the combination that matters: an always-running agent, the bot, facilitating the chat. The daemon could be assigning your chat requests to human beings, in a typical support-center scenario. It might be reading your sentences and applying some pattern rules to serve a preconfigured response. And because the state of the commoditized art allows us to programmatically interpret human sentences (either typed or spoken), the so-called NLP engine can be plugged into the chatbot to enhance the precision of the intent inference.
Most NLP-styled chatbot guides mention NLP and intent-action mapping in the same breath, but this is a false connection. Intent-action mapping is what it is, whether the inference is made via NLP, regex or rulesets. One might also throw machine learning into the stack to further enhance the inference via correlation and other techniques. Well, that's the short summary of what we get to read first in the chatbot buzz. What goes missing many times is the true nature of the chatbot for the end user, i.e. the conversational quality.
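To underline that intent-action mapping is independent of how the inference is made, here is a minimal mapping where the inference step is plain regex; the intents and patterns are invented for illustration, and an NLP engine could be swapped in without touching the mapping itself:

```python
import re

# The mapping: (inference pattern) -> (intent name). Swapping regex for an
# NLP model changes only the left-hand side, not the mapping's structure.
INTENTS = [
    (re.compile(r"\b(book|reserve)\b.*\bflight\b", re.IGNORECASE), "book_flight"),
    (re.compile(r"\b(pay|settle)\b.*\bbill\b", re.IGNORECASE), "pay_bill"),
]

def infer_intent(utterance):
    """Return the first matching intent, or a fallback."""
    for pattern, intent in INTENTS:
        if pattern.search(utterance):
            return intent
    return "fallback"

print(infer_intent("Please book a flight to Pune"))  # book_flight
print(infer_intent("I want to pay my bill"))         # pay_bill
print(infer_intent("hello there"))                   # fallback
```

Whatever actions hang off `book_flight` stay the same regardless of whether regex, rulesets or a trained model produced the label, which is exactly the false-connection point above.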
It is this conversational quality of the interaction that needs deeper scrutiny, because it represents an alternative to the laid-out quality of our interfaces.
This model is very powerful when the "range" of the expression is huge, like asking for qualified advice amidst multiple factors. But it is very lengthy when the expressions are straightforward.
A typical chatbot sample shown to us will be either a flight-booking bot, a bill-pay bot or an e-commerce bot. All these interactions are well modeled in the human mind, and the laid-out selection model fits better there. In fact, it is a liberal model (read: open) for both parties to explore more options alongside the intended interaction.
Whereas a personal assistant that can suggest a song based on the weather conditions, travel duration, earlier playlist usage and so on is the right case for chat (voice, text or gesture) to digest the complexity of the intent-explain-infer-confirm cycle.
Thus the argument that, because end users are more and more available on chat platforms, business processes should also move to them will only add to chat fatigue once the novelty fades out.
We need to trust the end user to decide on the more convenient and time-efficient model of interaction, and offer it to them. Hence the conversational quality, used in qualified places while retaining the laid-out quality of the presented content, is the right blend. A jump into chat via bots will be short-lived for most mainstream businesses. In fact, the power of ML or AI, as we like it, implies that the business understands and serves (is ready to serve) us better even before we start interacting. To transfer this whole responsibility onto AI-assisted chat interfaces via chatbots is laziness.
In the end, there is the Google search experience to the rescue. It lays out a nice search box for us to express ourselves, then does huge work in the background to make the best sense of what we intended, yet subtly suggests alternatives and corrections, like a human conversation, if its confidence in the inference it made wasn't high.
It is a good case of fitment-driven software rather than buzz-driven software... and that's the case against chatbots, AI or no AI.
Here are some samples of how subtly Google lays out the conversations, courtesy: littlebigdetails.
1. If you search the word "recursion" in Google, it'll suggest "recursion". If you click on the suggestion, it'll suggest "recursion" again... creating a recursive search. And don't miss the spell-correction prompt "did you mean".
2. Google Chrome displays some search results in the suggested-input area.
3. When searching for an upcoming movie, the Knowledge Graph box shows the release date and asks if you'd like to create a reminder.
4. The O'Reilly report:
In the consumer space... it is bleak. Amazon just launched the Amazon Go service.
The idea is to walk around the store picking items and let the Amazon magic wand do the billing and payment for you, something like how magic bands worked in amusement parks. What is important here is not how Amazon will identify the chosen products but how it will bill the customer. The experience Amazon wishes to give via the Go service is that of on-tap shopping. This mandates that the act of billing your account will be on tap. We can draw an analogy with how OAuth made user authentication on-tap for content providers. This is huge.
The idea of some sort of token of your payment power was well established by cheques and later via plastic cards. The point of sale handled the other part: establishing the identity of the billing party. The same was replicated by e-commerce sites. However, what was missing here is a marriage of discoverability of both parties as well as payment completion on the same tap, all within the realm of regular banking. We can alternatively say that the ease propagated by the maverick fintechs of the dot-com era, as well as the current breed of startups, is going mainstream, all via the power of Amazon to shape and coerce new behavior.
The problems that consumer banking apps went on to solve were those of products, payments and problems. It is at the payment level that most of the problems and innovations in this area happened. What Amazon Go will propagate is the paradigm of account as a service, on which payments can be made. It is kind of stretching the event horizon: once the cognitive habit of tap-and-pay becomes normal, all the problems of discovery and identification become part of a universal, standardized infrastructure that most banks will end up supporting. The shift is so fundamental that the current attempts at standardization of bill payments, QR cheques, universal account identifiers and the payment networks will be forced to converge and fall in line with the account-as-a-service model, due to the sheer power of end-user convenience.
What is then left with banks is products and problems. The products being the universal facility that banks offer, where information technology has little to add apart from some trivial suggestions based on the consumer profile. And the problems part, which the support executive can't get rid of, however much the banks would love to eliminate it, and for which they are nowadays looking towards bots as hope.
We must also factor in the usual suspects of value-chain arguments. In a world where bank accounts and transfers are available as a standardized commodity, there will be an inclination to move up the value chain. But we must remember that the very reason for the channel-banking app was to outsource clerical load onto the consumer and let them do it. What then can be more up the chain than liberalizing payments and accounts further, and letting the bank focus where its strength lies, i.e. better banking products? So skip the temptation to bring workflow and fund-sharing social features to your banking apps. The consumers might not all need them; and in the light of account-as-a-service, the app ecosystem will take care of the latest, mostly transient, tides of consumer behavior.