We Donated Our Brains to AI
The explosive rise of AI rests on a bedrock of data, and much of that data comes from the internet, a tool shaped by our own hands. So whether you’ve written a blog post or joined a public forum thread, you’ve contributed to the data sets that fuel the AI boom!
The Latest Buzz
An analysis by the Washington Post reveals that the AI industry has tapped a vast 30-year treasury of web publishing to train its machine learning models. This publicly available text has become core training material for the field.
Why It’s Important
You may have never thought about it, but your everyday online activities contribute to the development of AI chatbots. Whether it’s creating a blog or participating in a Reddit thread, your words help educate and shape the future of chatbot interactions. So next time you’re online, remember, you’re not just communicating with humans, but with the smart machines of tomorrow too.
Online content creators face a daunting challenge as their work is repurposed to train AI. The practice has set off a thorny legal fight over fair use, and it is changing the outlook for the millions of people whose postings built today’s online world.
How AI Chatbots Unravel Deep Thoughts on the Internet
For years we shared our thoughts on the internet, believing we were simply talking to one another. AI chatbots now mine that collective record and recombine it into fresh text of their own.
- But here’s the kicker: we were also unknowingly building a rich database of human expression, one that’s incomplete but oh so valuable for AI chatbots!
- This database is what makes the sentence-completion acrobatics of ChatGPT and its peers possible. Cool, huh?
Visual AI tools like Stable Diffusion, DALL-E, and Midjourney became hugely popular before text chatbots like ChatGPT took off. Consequently, photographers, illustrators, and fine artists were the first to confront the fact that their work was feeding these systems. At the same time, some visual creators are using AI-powered tools to enhance their craft, opening up new avenues for creative expression.
- Musicians are reaching the same realization as AI generates more and more imitations of their work. It hit home last week with “Heart on My Sleeve,” a fictitious yet tantalizing collaboration between Drake and the Weeknd.
While many of us have tried our hand at music and art, far more have expressed themselves through a few strokes of the keyboard. From social media to webpages, the internet offers a platform for infinite expression.
- Wondering how much a website contributed to the development of artificial intelligence? The Washington Post project lets you look up any internet domain name and see its share of a dataset used for AI training. Please note that this is not the same dataset used for OpenAI’s ChatGPT and other projects, as OpenAI has not shared its training-data sources.
- The Post found more than 500,000 personal blogs in the dataset, accounting for 3.8% of its total “tokens,” the small chunks of text that models learn from. Postings on Facebook, Instagram, Twitter, and other social media platforms are not part of the data, because those companies keep them private.
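To make the idea of a domain’s “share of tokens” concrete, here is a minimal Python sketch. It is not the Post’s method: it assumes a tiny made-up corpus of (domain, text) pairs and approximates tokens by counting whitespace-separated words, whereas real training datasets use subword tokenizers. The domain names and text below are hypothetical.

```python
from collections import Counter

# Hypothetical mini-corpus: (domain, page text) pairs invented for illustration.
CORPUS = [
    ("example-blog.com", "my first post about comics and speech balloons"),
    ("example-news.org", "markets rose today as investors weighed new data"),
    ("example-blog.com", "another short post"),
]


def token_share_by_domain(corpus):
    """Return {domain: fraction of all tokens} for (domain, text) pairs.

    Tokens are approximated by whitespace-separated words (an assumption;
    production datasets use subword tokenizers).
    """
    counts = Counter()
    for domain, text in corpus:
        counts[domain] += len(text.split())
    total = sum(counts.values())
    if total == 0:
        return {}  # empty corpus: no shares to report
    return {domain: n / total for domain, n in counts.items()}


shares = token_share_by_domain(CORPUS)
# Print domains from largest to smallest share.
for domain, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{domain}: {share:.1%}")
```

A figure like “3.8% of tokens” is the same kind of ratio, computed over a whole category of sites instead of a single domain.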
Did you know that the enormous training datasets behind AI models don’t always represent all cultures, groups, and subjects? It’s a serious issue: some are unfairly neglected, while others are oversampled. To make matters worse, the biases, limitations, and toxic corners of internet culture can easily end up tainting the AI training data. It’s an alarming concern that we need to address.
My Speech Balloons: As I perused the Post data set, I found that my 15-year-old personal blog, My Speech Balloons, is well represented. Most of the writing from my 10-year stint with a web magazine I co-created has also made it into the dataset.
- The Post’s research offers an irresistible chance to look yourself up, much like Googling your own name. A similar tool for images, “Have I Been Trained?”, lets you search for your visual work in AI training data.
- Once you find your work listed, questions tend to follow: “Is this what I signed up for?” or “Would things have been different if I had been consulted?” Many people have been caught off guard to discover their writing in a training set.
AI is recasting the 30-year history of the internet as a digital stockpile to be mined.
- From emojis to blog posts, the digital age is fueled by user-generated content. But as artificial intelligence (AI) advances, its insatiable appetite for our data has us seeing the internet in a completely new light.
- Without access to these heaps of information, some of today’s most exciting breakthroughs in AI would still be a pipe dream. However, we didn’t build these digital stockpiles for AI; we built them for each other.
Looking back from the present day, we can see that vast “corpuses” of data were an unintended but consequential by-product of the web’s rise. Back in ’95, the world fell in love with the “www” and browsers, and ten years later, blogs and the “wisdom of the crowd” were all the rage. But who knew that AI training data would be the outcome?
- Fast forward to the early 2010s, when a machine-learning revolution was dawning. Some forward-thinking experts anticipated the implications with unease.
- Yet, only a few truly comprehended the web’s shift towards being a training platform for AI.
Today, we’re confronted with the unintended consequences that come with our online experiences – and this is just the beginning. Our relationship with AI will inevitably shape our future, and the ramifications are still beyond our grasp.
- Consider this: a flood of AI imitations on public networks could stamp out inspiration for artists across the globe, stifling creativity and cultural advancement.
- If our current output ends up being the last original contribution of modern-day humanity, future AI models will be stuck in a loop of outdated content from the years 2000 to 2020, creating a flawed depiction of our society for generations to come.