Wednesday, June 18, 2025

How AI Could Soon Take Human-Computer Interaction to New Levels

As AI models approach excellence in speech recognition and synthesis, text processing, and multimodality, the ultimate voice user interfaces…

Voice User Interface (VUI) for natural speech-based human-computer interaction as imagined by Dall-E 3 via ChatGPT.

It was a typical Friday afternoon at the end of a long week of work on our project developing a radically new concept and app for molecular graphics in augmented and virtual reality, when I found myself in a heated discussion with a friend and colleague. He is a "hardcore" engineer, web programmer, and designer who has been in the trenches of web development for over a decade. As someone who prides himself on efficiency and control over every line of code, and who always has the user and the user experience in mind, he scoffed at my idea that voice interfaces will soon become the norm…

"Speech interfaces? They’re immature, awkward, and frankly, a little creepy", he said not with these exact words but certainly meaning them, and voicing a sentiment that many in the tech community share. And this was already after having kind of convinced him, maybe by 30–50%, that our augmented / virtual reality tool for molecular graphics and modeling absolutely needs such kind of human-computer interaction because since the users’ hands are busy grabbing and manipulating molecules, there’s no other way for them to control the program, for example to run commands and such.

More broadly, speech-based interfaces (or Voice User Interfaces, VUIs) can be a game-changer in work or entertainment situations where the hands are busy, and they can facilitate accessibility: combined with regular GUIs, they make software inclusive even for visually, hearing- and motion-impaired users. All of this makes the topic important to discuss and evaluate from the viewpoint of technology and UX design, and we must revisit it often given how fast the technology evolves. Moreover, as I will discuss here, I think the technology is reaching a point where it can already be pushed for, contrary to my colleague’s viewpoint, which remains quite negative.
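To make the hands-busy scenario concrete, here is a minimal sketch, assuming a browser-based app and the standard Web Speech API, of how spoken commands could be routed to actions in a molecular viewer while the user’s hands stay on the molecules. The command table and the app functions (rotateMolecule, showHydrogenBonds, resetCamera) are hypothetical placeholders for illustration, not part of our actual tool.

```typescript
// Minimal sketch: routing spoken commands to app actions with the Web Speech API.
// The app functions below are hypothetical stubs, not a real molecular-graphics library.

// The Web Speech API is still vendor-prefixed in Chromium-based browsers.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;

type CommandHandler = () => void;

// Simple keyword-to-action table; a real VUI would need far richer matching.
const commands: Record<string, CommandHandler> = {
  "rotate": () => rotateMolecule(),
  "hydrogen bonds": () => showHydrogenBonds(),
  "reset view": () => resetCamera(),
};

function startVoiceControl(): void {
  const recognition = new SpeechRecognitionImpl();
  recognition.lang = "en-US";
  recognition.continuous = true;      // keep listening while hands manipulate molecules
  recognition.interimResults = false; // only act on final transcripts

  recognition.onresult = (event: any) => {
    const transcript: string =
      event.results[event.results.length - 1][0].transcript.toLowerCase().trim();
    // Fire the first command whose keyword appears in the utterance.
    for (const [keyword, handler] of Object.entries(commands)) {
      if (transcript.includes(keyword)) {
        handler();
        break;
      }
    }
  };

  recognition.start();
}

// Hypothetical app actions, stubbed so the sketch is self-contained.
function rotateMolecule(): void { console.log("rotating molecule"); }
function showHydrogenBonds(): void { console.log("showing hydrogen bonds"); }
function resetCamera(): void { console.log("resetting camera"); }

startVoiceControl();
```

Even this naive keyword matching already frees the hands; the interesting leap, discussed further below, is replacing the rigid command table with a model that understands free-form speech.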

I do acknowledge, though, that my colleague’s concerns aren’t unfounded. He argues that speech interaction with computers is still plagued by inaccuracies, a frustrating need to repeat oneself, and a general lack of fluidity. And to an extent, I do know he’s right. (But… read on!)

A short but relevant detour: Voice User Interfaces as imagined by Star Trek

While I debated with my colleague about the current limitations of speech-based interfaces / VUIs, I couldn’t help but think of Daley Wilhelm’s articles exploring the future of UX, in particular her insightful piece titled Did Star Trek Predict the Future of UX?


(and by the way, I also recommend her article Designers: you need to read science fiction)

In her article on Star Trek predicting the future of UX, Daley Wilhelm discusses how the VUIs in Star Trek set user expectations for technology, shaping a big chunk of how we interact with our devices today. The seamless, intuitive voice commands that the crew of the Enterprise use to control their ship represent an ideal of what human-computer interaction could be… talking to the computer just like to another human. Star Trek got the iPads, the hand gestures, and even some aspects of multitouch displays right, so… did it also guess the future of VUIs right?

The series takes the same idea even further with Lt. Commander Data, a highly sophisticated android from Star Trek: The Next Generation, and the Emergency Medical Hologram Doctor from Star Trek: Voyager, both capable of sustaining very complex conversations, and even of using speech-based thinking themselves. (Further detour: Are human/artificial language models linked to human/artificial intelligence?)

Back to Daley Wilhelm: her key point is that while Star Trek’s vision of the future was ahead of its time, our real-world technology hasn’t quite caught up, at least not in the way the series imagined. In Star Trek, the crew interacts with the ship’s computer largely through voice commands, whether to access information, control ship functions, or even replicate food and beverages, yet with limitations, as she exemplifies.

This vision of a future where voice interfaces are the primary mode of human/robot/hologram-computer interaction is captivating and, for many like myself, an aspirational goal. And leaving my subjective opinion aside, there are all the advantages I outlined in the opening paragraphs.

In Star Trek, the ability to issue complex, context-rich commands and receive accurate, timely responses seems like a natural extension of technology’s potential. For example, Captain Picard could request a specific flavor of tea, at a specific temperature, and instantly receive exactly what he wanted: no fuss, no misunderstandings. But as Daley Wilhelm points out, modern voice assistants like Siri, Alexa, and Google Assistant struggle to meet these expectations, and by a wide margin. Today’s users often find these systems falling short of the conversational, context-aware interactions that Star Trek made us dream of. On the other hand, Daley Wilhelm presents an example of Star Trek’s computer not really understanding the user: when Geordi La Forge asks the computer for music with a "gentle Latin beat", the computer initially fails to deliver the exact type of music he had in mind, highlighting the challenge of ambiguity in natural language processing. I quote this specific example from her article because I will come back to it later on in the context of modern (real-world, 2024) technology.

But my point is that the limitations discussed by Daley Wilhelm resonate at first glance with many users and developers today, including my colleague. Unlike the seamless interactions depicted in Star Trek, our current VUIs often stumble over complex queries, struggle to understand context, and sometimes return irrelevant or incorrect responses. The reliance on recall, where users need to know exactly what they want to ask or command, contrasts sharply with the more natural recognition-based interaction that users typically expect. Thus, when using modern VUIs we often find ourselves needing to adapt to the technology, learning specific commands or phrasing questions in ways that the system can understand, rather than the technology adapting to us. But my point, developed below, is that current technology has much more to offer and probably isn’t that far behind.
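As a hint of what that could look like, here is a minimal sketch, assuming the official openai Node SDK and an entirely made-up intent schema, of how a modern large language model could turn Geordi’s ambiguous "gentle Latin beat" request into a structured intent, or into a clarifying question when the request is too vague, instead of simply failing like the assistants of a few years ago. The model name and JSON fields are illustrative choices, not a standard.

```typescript
// Minimal sketch (not any real assistant's internals): using an LLM to parse an
// ambiguous voice request into a structured intent. Assumes the official `openai`
// Node SDK; the intent schema and model name are illustrative assumptions.
import OpenAI from "openai";

interface MusicIntent {
  action: "play_music" | "clarify";
  genre?: string;
  tempo?: string;
  clarifying_question?: string; // filled in when the request is too ambiguous
}

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function parseRequest(utterance: string): Promise<MusicIntent> {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini", // any recent chat model would do
    messages: [
      {
        role: "system",
        content:
          "You turn spoken music requests into JSON with fields " +
          "`action` ('play_music' or 'clarify'), `genre`, `tempo`, and " +
          "`clarifying_question`. If the request is ambiguous, ask one short question.",
      },
      { role: "user", content: utterance },
    ],
    response_format: { type: "json_object" }, // ask the model for strict JSON output
  });

  return JSON.parse(response.choices[0].message.content ?? "{}") as MusicIntent;
}

// Example: the Geordi La Forge request from the article.
parseRequest("Computer, play some music with a gentle Latin beat").then((intent) => {
  if (intent.action === "clarify") {
    console.log("VUI asks:", intent.clarifying_question);
  } else {
    console.log("VUI plays:", intent.genre, "at tempo", intent.tempo);
  }
});
```

The point of the sketch is not the specific API but the shift it illustrates: the burden of disambiguation moves from the user (recall of exact commands) to the model (recognition of intent, plus the ability to ask back).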

In particular, note that since Daley Wilhelm published her article, the technological landscape has evolved quite rapidly. When her article came out in January 2023, ChatGPT had launched only a couple of months earlier, and OpenAI’s first really large and "smart" language model, GPT-3, had already been available through the API for a while. I had tried it there, before ChatGPT came out, and was astonished at the possibilities it could open up for more fluid and natural VUIs:
