Weekly AI Update: Autonomous Vehicle Deviates from Route

Headlines This Week

  • A Washington, D.C. judge has ruled that AI-created art does not qualify for copyright protection.
  • Meta has released SeamlessM4T, an innovative speech and text translation tool that supports numerous languages.
  • A recent investigation has exposed how content farms are employing AI to lift and repurpose news articles from established media outlets. A conversation with one of the investigators who uncovered the practice can be found below.
  • Lastly, Stephen King has weighed in on his books being used as training material for AI text-generating systems.

The Lead Story: Cruise's Setback

For years, the tech industry has been promising us self-driving vehicles, but imperfections in the technology have consistently kept that future out of reach. In the past few weeks, however, it seemed as though the dream of a driverless future might finally become a reality. On August 10, the California Public Utilities Commission granted expanded operating permits to two major "robotaxi" companies: Alphabet's Waymo and General Motors' Cruise. Both companies, which have been testing autonomous vehicles in the Bay Area for years, were given the green light to scale up their operations and generate revenue from their driverless fleets.

This development was enthusiastically received as a significant breakthrough for the autonomous transportation industry. Under the CPUC ruling, Waymo was granted permission to operate a "commercial passenger service" with its self-driving vehicles, allowing them to roam freely in San Francisco and parts of San Mateo County at any hour of the day, in all weather conditions, at speeds of up to 65 mph. Cruise was granted similar privileges in San Francisco, with a speed cap of 35 mph. Moreover, neither company is required to staff the autonomous vehicles with human "safety operators."

Following this triumphant announcement, the self-driving car industry appeared poised for unbridled growth. A disappointing chain of events, however, quickly put a damper on the celebrations. Late on a Thursday night, one of Cruise's autonomous vehicles collided with a fire truck in the Tenderloin district, sending a Cruise employee to the hospital. Shortly after, another Cruise vehicle stalled at a city intersection, causing significant traffic delays. Overnight, Cruise's recent gains seemed to evaporate. On the following Friday, the California Department of Motor Vehicles ordered Cruise to halve its fleet in the city, citing "recent concerning incidents" involving its vehicles, and Cruise complied.

This turn of events leaves the autonomous transportation sector at a peculiar juncture. With regulatory barriers removed, self-driving cars may soon become an integral part of our daily lives. The future we've been promised is one of fully automated luxury travel, with autonomous vehicles ferrying passengers down the highway while they relax in the driver's seat or watch movies on their iPhones. It remains unclear, however, whether that is genuinely what the future holds, or whether self-driving vehicles will primarily contribute to traffic congestion, accidents, or worse.

Barry Brown, a computer science professor at Copenhagen and Stockholm Universities, shared his insights with Gizmodo. Although self-driving cars have shown remarkable potential, he cautioned that they still lag behind in one crucial area: interpreting the intentions of other drivers on the road. Humans excel at this task, but AI-powered self-driving cars struggle to comprehend the subtleties needed to navigate complex social environments.

According to Brown, self-driving cars have limited success in predicting other drivers' movements. He explained, "They struggle to understand other drivers’ intentions. We humans are actually very good at doing that, but these self-driving cars really struggle to decipher those nuances."

The challenge, from Brown's perspective, lies in the nature of roadways as intricate social ecosystems teeming with subtle cues to guide drivers. Self-driving cars are currently not adept at reading these cues, making them more akin to inexperienced drivers still learning the ropes.

Brown emphasized, "We don't let five-year-olds drive. We wait until people are at an age where they have substantial experience in understanding how other people move. We're all seasoned experts at navigating through crowds of people, and we apply that expertise when we drive as well. Self-driving cars are quite proficient at predicting trajectory and movement, but they struggle to adapt to the subtleties of other road users and comprehend what's happening."

Complicated urban environments present an even greater challenge for autonomous vehicles, which still grapple with basic issues such as right-of-way and mixed traffic involving cyclists and pedestrians, particularly on densely trafficked streets like those of New York City. Brown contends, "These issues intensify and become increasingly complex."

The Interview: Jack Brewster

This week, we chatted with Jack Brewster, an analyst at NewsGuard, whose team recently published an eye-opening study on how dodgy websites use AI tools to lift news content from established media outlets. The study exposes the peculiar world of AI content farming, revealing that some websites appear to have fully automated their article-production process: they deploy bots to scrape news sites, then use AI chatbots to rephrase the scraped content into aggregated news, which is then monetized through advertising deals. This conversation has been edited for length and clarity.

How did you initially discover this pattern?

We've been monitoring a type of site we call UAINs—unreliable AI-generated news websites. Essentially, these are sites that appear to be next-generation content mills, using AI to produce content. As we were examining these sites, I noticed numerous publishing mistakes [the articles contained clear indications of chatbot usage, for example, "As an AI language model, I am unsure about the tastes of human readers..."]. I became intrigued by how many sites were employing AI this way—and that was essentially how it all started.
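As a trivial illustration of the kind of tell Brewster describes, a scan for leftover chatbot boilerplate might look like the sketch below. The phrase list and function name are hypothetical examples for illustration, not NewsGuard's actual methodology:

```python
import re

# Hypothetical telltale phrases that leak into auto-published articles
# when a chatbot's disclaimer or refusal text is pasted in unedited.
TELLTALE_PATTERNS = [
    r"as an ai language model",
    r"i cannot fulfill (this|that) request",
    r"my knowledge cutoff",
]

def looks_machine_generated(article_text: str) -> bool:
    """Return True if the article contains obvious chatbot residue."""
    lowered = article_text.lower()
    return any(re.search(pattern, lowered) for pattern in TELLTALE_PATTERNS)

# The error message quoted above would trip the check.
sample = "As an AI language model, I am unsure about the tastes of human readers..."
print(looks_machine_generated(sample))  # True
```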

Walk me through the AI plagiarism process. How does someone, or a website, take a New York Times article, feed it into a chatbot, and produce an "original" story?

One of the key points to emphasize here is that many of these sites seem to be doing this automatically—meaning they've completely automated the duplication process. Most likely, the programmers behind a site set up code to target a few specific websites; they deploy bots to scrape those sites for content, then feed the data into a large language model API, like ChatGPT's. Articles are then published automatically—no human intervention required. That's why the errors appear: the process hasn't yet been perfected, at least not for the sites we investigated. Needless to say, the next question is: if these are the more careless sites, how many others are a little more meticulous, editing out those error messages, or have optimized the process to make it totally seamless?
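To make the workflow Brewster describes concrete, here is a minimal sketch of what such a fully automated scrape-rephrase-publish pipeline might look like. Everything in it is an assumption for illustration—the target URL, the CSS selector, and the publishing step are invented, and the OpenAI client call is just one common way to reach a large language model API, not anything the investigated sites are known to use:

```python
import requests
from bs4 import BeautifulSoup
from openai import OpenAI  # assumes the official openai-python client

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def scrape_article(url: str) -> str:
    """Step 1: a bot pulls the raw article text from a target news site."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    # Hypothetical selector; real scrapers are tuned per target site.
    return " ".join(p.get_text() for p in soup.select("article p"))

def rephrase(text: str) -> str:
    """Step 2: feed the scraped text to an LLM to rewrite it."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Rewrite this news article in your own words:\n\n{text}",
        }],
    )
    return response.choices[0].message.content

def publish(article: str) -> None:
    """Step 3: push the rewrite live with no human review—which is
    exactly how stray 'As an AI language model...' errors end up published."""
    print(article)  # stand-in for an automated CMS upload

if __name__ == "__main__":
    publish(rephrase(scrape_article("https://example.com/some-article")))
```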

What are your thoughts on the consequences for the news industry? You could argue that, if this phenomenon grows large enough, it will drain an enormous volume of web traffic from reputable media organizations.

I'll make two points. The first and most critical issue for the news industry is to define this phenomenon...Is it a new form of plagiarism or simply efficient aggregation? That's something the affected news outlets need to address, and something for the courts to settle. The second point is that...[this trend] impacts our information ecosystem. Even if these sites aren't spreading misinformation per se, if they keep expanding at an exponential rate, it'll become increasingly difficult for the average person to distinguish high-quality information from low-quality information. That will make reading the news more taxing and access to high-quality information more difficult.

What about the AI industry? Do AI companies have any responsibility to assist in resolving this issue?

One thing I will say is that watermarking came up repeatedly during our research...it was one of the measures we encountered while exploring potential safeguards against this practice. Again, it's up to politicians, government officials, and AI companies to decide whether they want to tackle this issue.

Do human journalists need to be worried about this? A considerable portion of the journalism industry now revolves around news aggregation. If AI can do this effortlessly, doesn't it seem probable that media companies will shift towards this model, since an AI model can generate that content for them at a fraction of the cost?

Yeah, I guess what I'll say is that it's possible to envision a world where a few sites produce original content and countless bots copy, rewrite, and disseminate variations of that original content. I believe that's something we all should be concerned about.

Takeaways

  1. Experts like Barry Brown argue that self-driving cars still need to get much better at interpreting the intentions of other drivers before autonomous transportation can deliver a safer, more efficient future.
  2. NewsGuard's study of AI-powered content farms raises questions about the integrity of online information and the technology's impact on the news industry, underscoring the need for responsible use of AI tools.
