Post by account_disabled on Sept 14, 2023 13:11:27 GMT 3
In the world of technology, everyone is ultimately a parasite. As Dries Buytaert, the creator of Drupal, observed a few years ago, we are all more takers than makers. Buytaert noted that a common pattern in the open source community is that "takers don't give back meaningfully to the open source projects they benefit from, and in doing so harm the projects they depend on." There are always bound to be more takers than makers.
Parasitic tendencies have long been visible in Google, Facebook, and Twitter, all of which rely on other people's content, but they are even more evident in today's generative AI, says Sourcegraph developer Steve Yegge: "LLMs aren't just the biggest change since social, mobile, or cloud. They're the biggest thing since the World Wide Web."
That may be true, but large language models (LLMs) are parasitic by their very nature: they depend on scraping other people's code repositories (GitHub), answers to technical questions (Stack Overflow), and writing more generally.
As happened with open source, those who create, collect, and distribute content have begun walling off LLM access to it. For example, as Wired reports, Stack Overflow, facing declining site traffic, is, like Reddit, demanding that LLM makers pay to use its data for training. It's a bold move, reminiscent of the licensing wars fought in open source and the paywalls news publishers erected against Google and Facebook. But will it work?
Tragedy of the commons
The history of technological parasites predates open source, of course, but consider open source's early days, where I began my career. Companies trying to profit from other people's contributions have been around since the beginnings of Linux and MySQL. In the Linux world, for example, Rocky Linux and AlmaLinux both promise full compatibility with Red Hat Enterprise Linux (RHEL) while contributing little or nothing to Red Hat's success. The natural consequence of these RHEL clones succeeding is that the host withers, and with it the clones themselves. That's why some in the Linux community call them the 'parasites' of open source.
That may be an overstatement, but it makes the point. It's the same vein of criticism as the 'strip mining' charge once leveled at AWS, criticism that has spurred moves to more closed licenses, contorted business models, and a seemingly endless debate over open source sustainability.
Even so, open source is stronger than ever, though the fortunes of individual projects vary. Some projects and maintainers have found ways to manage 'takers' within their communities; others have not. Either way, the importance and power of open source keeps growing.
Depleting the well
Let's get back to LLMs. Large corporations like JPMorgan Chase are investing billions of dollars and employing more than 1,000 data scientists and machine learning engineers, reaping huge financial returns from personalization and analytics. And while many companies hesitate to adopt services like ChatGPT publicly, the reality is that their developers are already using LLMs to boost their productivity.
The cost of that productivity gain is now becoming clear, and it is being paid by companies like Stack Overflow that have historically been its source.
For example, according to Similarweb, Stack Overflow traffic has fallen by an average of 6% every month since January 2022, including a whopping 13.9% drop in March 2023. It's hard to say that ChatGPT and other generative AI tools are entirely responsible for the decline, but that doesn't mean they've had no impact.
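To get a feel for what a sustained decline like that adds up to, here is a rough back-of-the-envelope sketch. It assumes Similarweb's "average 6% per month" compounds month over month, which the report may not intend literally:

```python
# Illustration only: what a ~6% average monthly traffic decline implies
# over time, assuming the drop compounds month over month.

MONTHLY_DECLINE = 0.06

def remaining_traffic(months: int, monthly_decline: float = MONTHLY_DECLINE) -> float:
    """Fraction of original traffic left after `months` of compounded decline."""
    return (1 - monthly_decline) ** months

# January 2022 to March 2023 is roughly 14 months.
left = remaining_traffic(14)
print(f"Traffic remaining after 14 months: {left:.1%}")  # about 42%
print(f"Cumulative decline: {1 - left:.1%}")             # about 58%
```

Under that compounding assumption, fourteen months of 6% monthly declines would cut traffic by more than half, which is why a single month's 13.9% drop is such an alarming data point.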
Peter Nixey is the founder of International.io and a top-2% contributor on Stack Overflow; to date, his answers have been viewed by more than 1.7 million developers. Yet Nixey says he will probably never write on Stack Overflow again. Why? Because Stack Overflow's pool of knowledge risks being drained by LLMs such as ChatGPT.
"What happens when we stop pooling our knowledge with each other and instead pour it straight into the machine?" Nixey asks. By "the machine" he means generative AI such as ChatGPT. It is certainly convenient to get answers from an AI tool like GitHub Copilot, which was trained on GitHub repositories and Stack Overflow Q&A. But unlike Stack Overflow, a Q&A session with an AI happens in private, so it never becomes part of a public repository of knowledge. As Nixey puts it, "GPT-4 was trained on Stack Overflow questions posted before 2021. What will GPT-6 train on?"