How Much Open Source Code Will Be AI-Generated?
Introduction
The open-source software ecosystem stands at an inflection point. Across major technology companies, developer communities and enterprise environments, artificial intelligence is fundamentally reshaping how code is written and reviewed. The data emerging from 2025 reveals a transformation accelerating far faster than most anticipated, raising profound questions about the future composition, governance and security of the open-source infrastructure upon which modern digital civilization depends.
AI Code Generation Has Already Arrived
The numbers from early 2026 tell a story of rapid adoption that has exceeded industry projections. According to the latest research, approximately 41% of all code written globally is now AI-generated. This figure represents not a distant future scenario but the present reality of software development. GitHub Copilot, the most widely adopted AI coding assistant, now generates an average of 46% of the code written by its users, with Java developers experiencing rates as high as 61%.

The enterprise adoption trajectory provides further evidence of this shift. Microsoft CEO Satya Nadella revealed in April 2025 that between 20% and 30% of code in Microsoft’s repositories is entirely AI-generated. Google CEO Sundar Pichai indicated in October 2024 that over 25% of new code at Google originates from AI systems. These aren’t experimental pilot programs – they represent production code shipping to billions of users worldwide.

The developer community has embraced these tools with remarkable speed. By mid-2025, 82% of developers reported using AI coding tools either daily or weekly, while Stack Overflow’s 2025 developer survey found that 84% of respondents are using or planning to use AI tools in their development process, with 51% of professional developers using them daily. GitHub Copilot reached 20 million cumulative users by July 2025, marking 5 million new users in just three months. It has been adopted by 90% of Fortune 100 companies.
Exponential Growth Through 2035
Industry forecasts point toward an acceleration of this trend through the coming decade. Microsoft CTO Kevin Scott has predicted that by 2030, AI will generate 95% of all code. While this projection may initially appear hyperbolic, the underlying technological and economic forces suggest it represents a plausible trajectory rather than mere speculation.

The AI code assistant market itself reflects this momentum. The global market reached $3.9 billion in 2025 and is projected to grow to $6.6 billion by 2035, though more aggressive forecasts place the market between $20 billion and $30 billion by 2035, expanding at a compound annual growth rate of 18% to 25% through 2030. These figures understate the impact, as they measure only the tools market rather than the percentage of code being generated.
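As a rough sanity check, the aggressive end of those forecasts is roughly what sustained compound growth would deliver. The sketch below simply extends the 18% to 25% CAGR range from the 2025 base of $3.9 billion out to 2035; it is an illustrative extrapolation, not one of the cited forecasts.

```python
# Illustrative compound-growth check, not one of the cited forecasts:
# extend the 18-25% CAGR range from the 2025 base of ~$3.9B out to 2035.
base_2025 = 3.9  # AI code assistant market, USD billions
years = 10       # 2025 -> 2035

for cagr in (0.18, 0.25):
    projected_2035 = base_2025 * (1 + cagr) ** years
    print(f"CAGR {cagr:.0%}: ~${projected_2035:.1f}B by 2035")

# Prints roughly $20.4B at 18% and $36.3B at 25%, bracketing the
# $20-30 billion range that the more aggressive forecasts describe.
```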
Anthropic CEO Dario Amodei suggested in mid-2025 that AI would be writing 90% of code within three to six months – a prediction that, while not yet realized, illustrates how aggressive expectations have become among leading AI companies. Meta CEO Mark Zuckerberg stated that within a year, approximately half of Meta’s development would be accomplished by AI rather than humans, with that percentage continuing to grow.
Open-Source at the Epicenter of Transformation
The open-source ecosystem has become ground zero for AI-driven code generation. GitHub’s Octoverse 2025 report reveals that more than 1.1 million public repositories now depend on generative AI SDKs, representing a 178% year-over-year increase. Remarkably, 693,000+ of these repositories were created in just the last 12 months, sharply outpacing 2024’s total of approximately 400,000. GitHub now hosts over 630 million total repositories, adding more than 121 million new repositories in 2025 alone.
Six of the ten fastest-growing open-source repositories by contributor count in 2025 were AI infrastructure projects. Projects such as vllm, ollama, ragflow, and llama.cpp dominate contributor growth, confirming that the open source community is investing heavily in the foundation layers of AI – model runtimes, inference engines and orchestration frameworks. This creates a self-reinforcing cycle: open-source developers build AI infrastructure tools, which in turn generate more open source code, which feeds back into training data for future AI models.

The scale of AI-related open source activity is unprecedented. GitHub reported 65,000 public generative AI projects created in 2023, marking a 248% year-over-year growth. By 2025, this had accelerated further, with AI-related repositories supported by more than 1.05 million contributors and generating 1.75 million monthly commits, a 4.8-fold increase since 2023. Programming queries accounted for roughly 11% of total token volume to large language models in early 2025 and exceeded 50% in recent weeks, demonstrating that code generation has become the dominant use case for AI systems.
Security and Maintainability Concerns
As AI-generated code proliferates through open source repositories, significant concerns about code quality, security vulnerabilities and long-term maintainability have emerged. Research from multiple sources paints a troubling picture of the security implications.
A comprehensive study by CodeRabbit found that AI-generated code creates 1.7 times more problems than human-written code. The analysis revealed that AI-generated code often omits critical security controls – null checks, early returns, guardrails, comprehensive exception logic – issues directly tied to real-world system outages. Excessive input/output operations were approximately eight times more common in AI-authored pull requests, reflecting AI’s tendency to favor code clarity and simple patterns over resource efficiency.
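The pattern these findings describe is easy to illustrate. The hypothetical snippet below (an invented example, not code from the CodeRabbit dataset) contrasts the happy-path style the report attributes to AI suggestions with a version that adds the null checks, early returns and exception handling most often found missing:

```python
# Hypothetical illustration of the review findings: AI-suggested code often
# handles only the happy path, while production-grade code needs guardrails.

# Happy-path style frequently seen in AI suggestions: no validation,
# no error handling, silent assumptions about the input shape.
def get_discount_naive(order):
    return order["total"] * order["customer"]["discount_rate"]

# Hardened version: null checks, early returns, and explicit exception
# handling around the parts that can realistically fail.
def get_discount(order):
    if not order or "total" not in order:
        return 0.0  # early return instead of a KeyError at runtime
    customer = order.get("customer") or {}
    try:
        rate = float(customer.get("discount_rate", 0.0))
    except (TypeError, ValueError):
        rate = 0.0  # tolerate malformed data rather than crashing
    if not 0.0 <= rate <= 1.0:
        rate = 0.0  # guardrail: never apply an out-of-range discount
    return order["total"] * rate
```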
Academic research supports these findings. A study analyzing 58 commonly asked C++ programming questions found that large language models generate vulnerable code regardless of parameter settings, with issues recurring across different question types, such as file handling and memory management. The LLM-CSEC benchmark, which uses 280 real-world prompts that commonly lead to security issues, found that even with explicit “secure code generator” prompting, the median LLM generation contains multiple high-severity vulnerabilities. Every model tested produced code containing critical vulnerabilities, including those linked to well-documented Common Weakness Enumerations (CWEs).

The problem stems from training data quality. As systematic literature reviews reveal, AI models are trained on code repositories that are themselves “ripe with vulnerabilities and bad practice”. When AI systems learn from flawed training data, they inevitably reproduce those flaws. A Stanford University study found that software engineers using code-generating AI systems were more likely to cause security vulnerabilities in their applications and, even more concerning, developers were more likely to believe their insecure AI-generated solutions were actually secure compared to control groups.
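One of the best-documented weakness classes of this kind is SQL injection (CWE-89) via string-built queries. The sketch below is an invented illustration, not one of the benchmark prompts; it shows the vulnerable pattern alongside the parameterized fix:

```python
import sqlite3

# CWE-89 (SQL injection): the pattern secure-code benchmarks repeatedly flag
# in LLM output is query construction via string interpolation.
def find_user_vulnerable(conn: sqlite3.Connection, username: str):
    query = f"SELECT id, email FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()  # attacker-controlled input reaches the SQL grammar

# Fix: parameterized queries keep user input out of the SQL grammar entirely.
def find_user_safe(conn: sqlite3.Connection, username: str):
    query = "SELECT id, email FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()
```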
Security leaders have taken notice. A survey of 800 security decision-makers found that 63% have considered banning AI in coding due to security risks, with 92% expressing concerns about AI-generated code in their organizations. The three primary concerns identified were developers becoming over-reliant on AI leading to lower standards, AI-written code not being effectively quality-checked and AI using outdated open-source libraries.

Whether because of these quality concerns or simply the volume of suggestions that widespread AI tool usage produces, only about 30% of AI-generated code suggestions are actually accepted by developers. GitHub Copilot’s code acceptance rate averages between 27% and 30%, though developers retain 88% of accepted code in final submissions, suggesting that while developers are selective, the code they do accept is generally production-ready. However, GitClear’s 2024 analysis of over 153 million lines of code found that AI-assisted coding is linked to four times more code duplication than before. AI may be changing code quality metrics in concerning ways.
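Duplication of the kind GitClear measures can be approximated with very simple tooling. The sketch below uses a minimal line-window hashing approach, assumed for illustration and not GitClear's actual methodology, to flag blocks of code that appear verbatim in more than one place:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Minimal duplicate-block detector: hash sliding windows of normalized lines
# and report any window that appears in more than one location. This is a toy
# approximation of a duplication metric, not GitClear's methodology.
WINDOW = 6  # consecutive non-blank lines per block

def blocks(path: Path):
    lines = [ln.strip() for ln in path.read_text(errors="ignore").splitlines()]
    lines = [ln for ln in lines if ln]  # ignore blank lines
    for i in range(len(lines) - WINDOW + 1):
        digest = hashlib.sha256("\n".join(lines[i:i + WINDOW]).encode()).hexdigest()
        yield digest, (str(path), i + 1)

def find_duplicates(root: str):
    seen = defaultdict(list)
    for path in Path(root).rglob("*.py"):
        for digest, location in blocks(path):
            seen[digest].append(location)
    return {d: locs for d, locs in seen.items() if len(locs) > 1}

if __name__ == "__main__":
    for digest, locs in find_duplicates(".").items():
        print(f"duplicated block ({digest[:10]}): {locs}")
```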
The Maintainer Crisis
The proliferation of AI-generated contributions has created an unprecedented burden for open source maintainers, who are predominantly unpaid volunteers. Daniel Stenberg, creator of curl, remarked in 2025 that the project is being “effectively DDoSed” by AI-generated bug reports. Approximately 20% of submissions to curl in 2025 were categorized as AI-generated noise, with the volume at one point surging to eight times the typical amount. Stenberg is now contemplating discontinuing the project’s bug bounty program entirely. This pattern extends across major open-source projects. The maintainers of OCaml rejected a massive 13,000-line pull request generated by AI, reasoning that evaluating AI-produced code is more demanding than assessing human-written code and an influx of low-effort pull requests poses significant risk of overwhelming their review systems. Anthony Fu and others in the Vue ecosystem report being inundated with pull requests from contributors who use AI to respond to “help wanted” issues, then mechanically work through review comments without genuine understanding of the code.
The problem is structural. Many contributors, often students seeking to enhance their resumes or bounty hunters chasing rewards, leverage AI to generate large volumes of pull requests and bug reports. While the initial output may appear credible, it frequently falls apart during the review process. Maintainers spend hours sifting through low-quality content, time they cannot devote to legitimate contributions or core development work. GitHub has inadvertently exacerbated the problem by incorporating Copilot into issue and pull request creation, making it impossible to block this feature or identify which submissions originated from AI. The inability to distinguish AI-generated contributions from human ones forces maintainers to evaluate all submissions with equal scrutiny, multiplying their workload precisely when AI tools promise to reduce it.

Some maintainers report more nuanced experiences. A maintainer’s perspective from late 2025 notes that “contributors now have access to powerful AI tools, but many maintainers don’t – and without them, maintainers only feel the negatives: more contributions to review, some low-quality, without the means to keep up”. This highlights a critical asymmetry: contributors are AI-augmented while maintainers often are not, creating a productivity imbalance that threatens the sustainability of open source development.
The Unresolved Legal Landscape
The legal status of AI-generated code in open source contexts remains deeply uncertain, with potentially profound implications for the next decade. Current copyright law in most jurisdictions holds that code generated solely by AI, without substantial human authorship, is not eligible for copyright protection. This creates a paradoxical situation for open source. If AI-generated code cannot be copyrighted, it cannot be properly licensed under traditional open source licenses, which depend on copyright law for their legal force.

The risk of license contamination compounds the problem. Many AI models, including GitHub Copilot, are trained on vast repositories of open source code, some of which is governed by strong copyleft licenses such as the GNU General Public License (GPL). While these licenses permit creating derivative works, they require that any program built using GPL-licensed code must itself be released under GPL. There remains a risk that AI tools output code substantially similar or identical to existing copyleft-licensed code. If developers unknowingly incorporate such code into proprietary projects, they could face copyright infringement claims.
Major open-source projects are grappling with how to address AI contributions. The Linux kernel community has developed guidelines for AI-assisted patches, proposed by NVIDIA developer Sasha Levin. The v3 iteration of the proposal emphasizes transparency and accountability, requiring developers to disclose AI involvement through a ‘Co-developed-by’ tag. Linus Torvalds, Linux’s creator, has advocated for treating AI tools no differently than traditional coding aids, seeing no need for special copyright treatment and viewing AI contributions as extensions of the developer’s work.

However, not all projects share this pragmatic approach. NetBSD and Gentoo have implemented restrictive policies against AI-generated contributions. The curl project banned AI-generated security reports due to floods of low-quality submissions. The LLVM compiler project adopted a “human in the loop” policy in January 2026, banning code contributions submitted by AI agents without human approval and requiring that contributors using AI assistance review all code and be able to answer questions about it without reference back to the AI.

Ongoing litigation will shape the legal landscape. The GitHub Copilot Intellectual Property Litigation, filed in late 2022, alleges that Microsoft and OpenAI profited from open source programmers’ work by violating open-source license conditions. A judge dismissed some claims in summer 2024, reasoning that AI-generated code is not identical to the training data and thus does not violate U.S. copyright law, which generally applies only to identical or near-identical reproductions. The plaintiffs appealed, and as of spring 2025, litigation remains ongoing. The New York Times lawsuit against OpenAI, while focused on text rather than code, could have significant implications. If courts rule that output generated by AI models trained on certain data qualifies as reuse of that data, it would support claims that generative AI violates open source software licenses when trained on and reproducing open source code.
The Open Source Initiative (OSI) has recognized that traditional open source definitions are insufficient for AI systems. Their Open Source AI Definition (OSAID) requires that the preferred form for making modifications to machine learning systems must include data information (detailed information about training data), complete source code used to train and run the system, and parameters (weights refined during training). However, the list of AI models validated as complying with OSAID remains relatively short, including only Pythia, OLMo, Amber, CrystalCoder, and T5.
A Self-Consuming Ecosystem?
A particularly concerning phenomenon threatens the long-term quality of AI-generated code: model collapse. This occurs when machine learning models gradually degrade due to errors from uncurated training on outputs of other models, including prior versions of themselves. As Shumailov and colleagues, who coined the term, describe, model collapse progresses through two stages:
- early model collapse, where the model begins losing information about minority data in distribution tails
- late model collapse, where the model loses significant performance, confusing concepts and losing most variance.
The mechanism is straightforward but insidious. As AI-generated data proliferates on the internet, it inevitably ends up in future training datasets, which are often crawled from public sources. If AI models are trained on large quantities of unlabeled synthetic data – what researchers call “slop” – without proper curation, model collapse becomes increasingly likely. For open source code repositories, which are primary sources of training data for AI coding assistants, this creates a feedback loop. AI generates code, that code is committed to repositories, those repositories are scraped to train the next generation of AI models, which then generate even more degraded code.

Recent research offers both warnings and potential solutions. Studies show that if synthetic data accumulates alongside human-generated data rather than replacing it, model collapse can be avoided. Verification of synthetic data by humans or superior models can prevent collapse and even drive improvement in the short term, though long-term iterative retraining eventually drives parameters toward the verifier’s “knowledge center” rather than ground truth. Importantly, research demonstrates that even small proportions of synthetic data can harm performance if not properly curated.

For open-source repositories through 2035, this suggests that the proportion of AI-generated code matters less than the curation and verification processes surrounding it. Repositories that maintain strong human review processes and preserve historical human-written code alongside new AI contributions may avoid quality degradation. Those that accept uncritical floods of AI-generated pull requests risk becoming training data that progressively degrades future AI models, creating a vicious cycle.
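The replace-versus-accumulate distinction can be demonstrated with a toy experiment: repeatedly fit a trivial model to its own outputs and watch the distribution's variance shrink. The sketch below is a deliberately simplified Gaussian analogue of the collapse mechanism, not an experiment on code models; the "mixed" variant keeps the original human data in every generation's training set, as the research above recommends:

```python
import random
import statistics

# Toy model-collapse demo: the "model" is just a fitted mean and standard
# deviation. Each generation it is refit and then sampled to produce the next
# generation's training data. Pure self-training loses the tails and its
# variance drifts toward zero; accumulating synthetic data alongside the
# retained human corpus keeps the fit anchored. A simplified analogue of the
# mechanism, not a faithful simulation of code-model training.
HUMAN = [random.gauss(0, 1) for _ in range(30)]  # small "human-written" corpus

def fit(data):
    return statistics.fmean(data), statistics.pstdev(data)

def sample(model, n):
    mu, sigma = model
    return [random.gauss(mu, sigma) for _ in range(n)]

pure_data = list(HUMAN)   # synthetic data replaces everything each generation
mixed_data = list(HUMAN)  # synthetic data accumulates alongside the human corpus

for gen in range(1, 41):
    pure_model, mixed_model = fit(pure_data), fit(mixed_data)
    pure_data = sample(pure_model, len(HUMAN))
    mixed_data = mixed_data + sample(mixed_model, len(HUMAN))
    if gen % 10 == 0:
        print(f"gen {gen:2d}  pure sigma={pure_model[1]:.2f}  mixed sigma={mixed_model[1]:.2f}")

# Typical output: the pure self-trained sigma shrinks far below the true value
# of 1.0 within a few dozen generations, while the mixed sigma stays close to it.
```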
Open Source Code Composition in 2035
Based on current trajectories and underlying technological trends, several scenarios emerge for the composition of open source code by 2035:
- The Conservative Scenario (40 to 60% AI-Generated). If quality concerns, legal uncertainties, and maintainer resistance successfully temper adoption, AI-generated code might stabilize at 40-60% of new contributions by 2035. This scenario assumes that the security vulnerabilities and code quality issues currently observed drive increased scrutiny and selective adoption, with AI tools primarily used for boilerplate code, documentation, and test generation rather than core logic. Major projects implement strict human-in-the-loop requirements similar to LLVM’s policy, and legal frameworks clarify that AI-generated code requires substantial human modification to be copyrightable and properly licensed.
- The Moderate Scenario (60 to 80% AI-Generated). This represents the most likely trajectory based on current enterprise adoption rates and market forecasts. By 2035, AI coding assistants have become as ubiquitous as integrated development environments, generating 60-80% of initial code. However, human developers retain essential roles in architecture, security review, and complex problem-solving. Tools have improved significantly, with better context awareness and fewer security vulnerabilities. Legal frameworks have adapted, and open source licenses have been updated to accommodate AI-generated contributions. Verification tools powered by AI help maintainers handle higher contribution volumes. This scenario aligns with predictions from industry leaders like Kevin Scott and Satya Nadella but accounts for the friction and quality concerns that will inevitably moderate pure adoption curves.
- The Transformative Scenario (80 to 95% AI-Generated). In this scenario, which assumes continued exponential improvement in AI capabilities and the emergence of true AI software engineering agents, AI generates 80-95% of code by 2035. Developers function primarily as system architects, prompt engineers, and verifiers, with AI handling not just code generation but also testing, debugging, documentation, and even initial code review. The definition of “contributor” expands dramatically to include non-programmers who can describe desired functionality in natural language. Open source repositories implement AI maintainer assistants that handle triage, initial review, and routine maintenance. This scenario requires resolution of current security and quality issues through better AI models, improved training data curation, and sophisticated verification systems.
- The Bifurcated Scenario. Rather than a uniform shift, the open source ecosystem splits along quality and criticality lines. Infrastructure-critical projects like the Linux kernel, cryptographic libraries, and core language runtimes maintain strict limits on AI-generated code, perhaps 20 to 40%, with extensive human review and formal verification requirements. Meanwhile, application-layer projects, developer tools, and experimental repositories embrace AI generation at rates approaching 90 to 95%. This creates a two-tier ecosystem where foundational projects remain primarily human-authored while the vast majority of code volume is AI-generated.
The most probable outcome by 2035 combines elements of the moderate and bifurcated scenarios: overall AI generation reaches 60-75% across all open source code, but with significant variance based on project criticality, domain, and maturity. Mature, security-critical projects maintain 40-50% AI generation with rigorous review, while newer, experimental, and application-layer projects approach 85-90% AI generation.
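As a back-of-the-envelope illustration of how such a blend produces the headline figure, the sketch below assigns hypothetical volume shares to the two tiers; the 30/70 split is an assumption invented for this example, not a cited estimate:

```python
# Hypothetical blend of the bifurcated scenario: the tier volume shares below
# (30% / 70%) are illustrative assumptions, not cited estimates.
tiers = {
    "security-critical infrastructure": (0.30, 0.45),  # (share of code volume, AI-generated fraction)
    "application-layer and experimental": (0.70, 0.87),
}
overall = sum(share * ai_fraction for share, ai_fraction in tiers.values())
print(f"blended AI-generated share: {overall:.0%}")  # ~74%, near the top of the 60-75% range
```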
The Changing Nature of Contribution and Development
The fundamental nature of software development and open source contribution is transforming alongside code generation percentages. By 2035, the role of software engineer will have evolved from code writer to what industry analysts describe as “system composer,” “AI orchestrator,” or “value engineer”.
Developers will spend significantly less time on syntax and implementation details and more time on higher-order activities: defining system architecture, establishing guardrails and constraints for AI code generation, conducting security and logic reviews, integrating components and making strategic technical decisions. The most valuable engineers will not be those who code fastest, but those who can ask the right questions of AI systems, critically evaluate generated code and understand both technical implementation and business domain requirements.

New specializations will emerge. “AI Risk Engineers” and “Security-Orchestration Engineers” will focus on ensuring AI-generated systems meet security and compliance requirements. “Prompt Engineers” will craft the instructions that guide AI code generation. “Trust Engineers” will establish governance frameworks and accountability measures for AI-assisted development. “Human-Machine Teaming Managers” will optimize collaboration between human developers and AI agents.

For open-source specifically, the contributor demographic will expand dramatically. Natural language interfaces to code generation will lower barriers to entry, enabling domain experts without traditional programming skills to contribute meaningful functionality. This democratization could revitalize unmaintained projects and bring fresh perspectives to established ones. However, it also risks overwhelming maintainers with contributions from people who lack deep understanding of software engineering principles, exacerbating current challenges.

The economics of open-source maintenance will require reconsideration. If AI companies derive significant value from open source repositories as both training data and deployment targets, calls for these companies to sponsor maintainers and provide them with access to premium AI tools will likely intensify. Some argue that providing maintainers with the same AI assistance available to contributors represents both pragmatic necessity and ethical obligation.
Strategic Implications and Recommendations
For open-source projects and the broader developer community, several strategic considerations emerge:
- Develop AI Governance Frameworks Now: Projects should establish clear policies regarding AI-generated contributions before they become overwhelming. The Linux kernel’s approach – requiring transparency through tags, maintaining human accountability, and emphasizing that developers must understand and be able to explain code regardless of how it was generated – provides a reasonable template. Projects should decide early whether to embrace, limit, or segregate AI contributions based on their specific security and quality requirements.
- Invest in Verification Infrastructure: The quality gap between AI-generated and human-written code demands enhanced verification. This includes expanding automated testing, implementing AI-powered code review tools that can detect common AI-generated vulnerabilities, establishing security-focused static analysis in continuous integration pipelines, and maintaining strict manual review requirements for security-critical components. Some projects may benefit from AI maintainer assistants that provide initial triage while human maintainers focus on substantive review.
- Address the Training Data Challenge: Open source communities should engage with AI companies to ensure training data is ethically sourced, properly attributed, and curated for quality. Projects might consider explicit licensing terms that address AI training usage, similar to how Creative Commons licenses evolved to address different use cases. The OSI’s work on Open Source AI Definition represents important progress, but widespread adoption requires clearer guidelines and enforcement mechanisms.
- Preserve Human-Written Code: Given model collapse risks, open source repositories should maintain clear provenance tracking that distinguishes human-written code from AI-generated contributions (a minimal provenance-scanning sketch follows this list). Historical human-written code represents increasingly valuable training data and should be preserved, documented, and potentially maintained separately to prevent contamination by lower-quality AI-generated code. Version control systems might evolve to include AI generation metadata as a first-class feature.
- Strengthen Maintainer Support: The asymmetry between AI-augmented contributors and non-augmented maintainers threatens open source sustainability. Foundations and sponsors should provide maintainers with access to premium AI coding and review tools, fund maintainer positions rather than relying solely on volunteers, develop AI-powered triage and moderation tools designed specifically for maintainer workflows, and create cross-project reputation systems that help maintainers identify high-quality versus low-effort contributors.
- Embrace Hybrid Development Models: The most successful approach likely involves treating AI as a productivity multiplier rather than a replacement for human judgment. Organizations should use AI for routine tasks including boilerplate code, test generation, documentation, and initial implementation, while maintaining human oversight for architecture, security review, business logic, and complex problem-solving. Research shows that teams treating AI as a process challenge rather than merely a technology challenge achieve significantly better outcomes.
- Invest in Developer Skills Evolution: As AI handles more implementation details, developers must cultivate complementary skills: advanced system design and architecture, security and vulnerability assessment, domain expertise in specific industries or applications, prompt engineering and AI interaction, critical evaluation of AI-generated outputs, and understanding of AI limitations and failure modes. Educational institutions and companies should redesign training programs to emphasize these higher-order skills rather than syntax memorization.
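As a starting point for the provenance-tracking recommendation above, disclosure trailers of the kind discussed for the Linux kernel can already be mined with ordinary tooling. The sketch below assumes a project whose policy asks contributors to disclose AI involvement through a 'Co-developed-by:' tag naming the tool, or a hypothetical 'AI-assisted:' trailer; it reports what share of recent commits carry such a disclosure:

```python
import subprocess

# Sketch: estimate the share of recent commits that disclose AI involvement via
# a commit trailer. Assumes a disclosure convention such as the
# 'Co-developed-by:' tag discussed for the Linux kernel (naming an AI tool) or
# a hypothetical 'AI-assisted:' trailer; adjust both tuples to the project's policy.
AI_TRAILERS = ("co-developed-by:", "ai-assisted:")
AI_TOOL_HINTS = ("copilot", "claude", "gpt", "gemini", "cursor")

def ai_disclosure_ratio(repo_path: str, max_commits: int = 500) -> float:
    # %x00 and %x01 are byte separators so commit bodies can be split reliably.
    log = subprocess.run(
        ["git", "-C", repo_path, "log", f"-{max_commits}", "--pretty=format:%H%x00%B%x01"],
        capture_output=True, text=True, check=True,
    ).stdout
    commits = [c for c in log.split("\x01") if c.strip()]
    flagged = sum(
        1 for commit in commits
        if any(t in commit.lower() for t in AI_TRAILERS)
        and any(h in commit.lower() for h in AI_TOOL_HINTS)
    )
    return flagged / len(commits) if commits else 0.0

if __name__ == "__main__":
    print(f"commits with an AI disclosure trailer: {ai_disclosure_ratio('.'):.1%}")
```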
Conclusion
The question is not whether substantial portions of open-source code will be AI-generated by 2035, but rather how the ecosystem will adapt to this transformation while preserving the qualities that made open-source successful: code quality, security, collaborative innovation and knowledge sharing. Current data suggests that by 2035, AI will likely generate between 60% and 80% of new open-source code contributions, with significant variance based on project type, domain and governance choices. This represents a fundamental shift in software development, comparable to the transitions from assembly to high-level languages or from procedural to object-oriented programming. However, unlike those previous transitions, this one occurs on a compressed timeline and raises novel questions about authorship, accountability, legal liability, and the very nature of contribution.

The path forward requires neither uncritical embrace nor reactionary rejection of AI code generation. Instead, it demands thoughtful governance, rigorous verification, investment in maintainer support, evolution of legal frameworks, and recognition that while AI can generate code, human judgment remains essential for determining what code should be generated, how it integrates into broader systems, and whether it truly solves the problems at hand.

Open source has weathered previous existential challenges – from proprietary software dominance to patent threats to security vulnerabilities. The AI code generation transition may prove the most profound yet, but the principles that sustained open source through previous challenges remain relevant: transparency, collaboration, peer review, and the collective wisdom of the developer community. By applying these principles to AI-generated contributions – insisting on transparency about generation methods, collaborative review processes, rigorous peer evaluation, and collective standards for quality – the open source ecosystem can harness AI’s productivity benefits while mitigating its risks.

The open source code of 2035 will likely be a hybrid creation: AI-generated in its implementation details but human-guided in its architecture, human-verified in its security properties, human-maintained in its evolution, and ultimately human-accountable in its impacts on society. The challenge for the next decade lies in building the governance structures, verification tools, legal frameworks, and community practices that make this hybrid model sustainable, secure, and true to open source principles.
References
Elite Brains. (2025). AI-Generated Code Statistics 2025: Is Your Developer Job Safe?[elitebrains]
CNBC. (2025). Satya Nadella says as much as 30% of Microsoft code is written by AI.[cnbc]
Quantum Run. (2026). GitHub Copilot Statistics 2026.[quantumrun]
Netcorp Software Development. (2026). AI-Generated Code Statistics 2026: Can AI Replace Your Developer?[netcorpsoftwaredevelopment]
Reddit. (2024). What percentage of code is now written by AI?[reddit]
Opsera. (2025). Github Copilot Adoption Trends: Insights from Real Data.[opsera]
Panto AI. (2026). AI Coding Assistant Statistics and Global Trends for 2026.[getpanto]
Second Talent. (2025). AI Coding Assistant Statistics & Trends.[secondtalent]
arXiv. (2025). Experience with GitHub Copilot for Developer Productivity at Zoominfo.[arxiv]
Master of Code. (2026). 350+ Generative AI Statistics [January 2026].[masterofcode]
Reddit. (2024). What percent of code is now written by AI?[reddit]
Tenet. (2025). Github Copilot Usage Data Statistics For 2026.[wearetenet]
MIT Technology Review. (2025). AI coding is now everywhere. But not everyone is convinced.[technologyreview]
Reddit. (2025). Anthropic CEO: AI Will Be Writing 90% of Code in 3 to 6 Months.[reddit]
Second Talent. (2025). GitHub Copilot Statistics & Adoption Trends.[secondtalent]
OpenRouter. (2024). State of AI 2025: 100T Token LLM Usage Study.[openrouter]
GitHub Blog. (2025). Octoverse: A new developer joins GitHub every second as AI leads TypeScript to #1.[github]
Abeta Automation. (2025). AI Will Write 95% of Code by 2030.[abetaautomation]
LinkedIn. (2025). Top 04 Open-Source Generative AI Models of 2025.[linkedin]
arXiv. (2024). The Impact of Generative AI on Collaborative Open-Source Software.[arxiv]
Yuma AI. (2026). 7 Bold AI Predictions for 2035.[yuma]
OS-SCI. (2025). Open vs. Closed: The State of AI Code Creation Platforms in 2025.[os-sci]
OpenSSF. (2025). AI, State Actors, and Supply Chains.[openssf]
LinkedIn. (2025). AI will replace 95% of coding by 2030, predicts Microsoft CTO.[linkedin]
Red Hat. (2026). The state of open source AI models in 2025.[developers.redhat]
METR. (2025). Measuring the Impact of Early-2025 AI on Experienced Open Source Developers.[metr]
Epoch AI. (2025). What will AI look like in 2030?[epoch]
Duck Alignment Academy. (2025). Open source trends 2025.[duckalignment]
Hacker News. (2026). If AI is so good at coding where are the open source contributions?[news.ycombinator]
Sundeep Teki. (2025). AI & Your Career: Charting Your Success from 2025 to 2035.[sundeepteki]
Grøn. (2025). The Code Quality Conundrum: Why Open Source Should Embrace Critical Evaluation of AI-generated Contributions.
Reddit. (2026). Open source is being DDoSed by AI slop and GitHub is making it worse.[reddit]
st0012.dev. (2025). AI and Open Source: A Maintainer’s Take (End of 2025).[st0012]
Sonar Source. (2023). AI Code Generation Benefits & Risks.[sonarsource]
Graphite. (2025). Best AI pull request reviewers in 2025.[graphite]
Reddit. (2025). Open Source Maintainers – Tell me about your struggles.[reddit]
CodeRabbit. (2025). Our new report: AI code creates 1.7x more problems.[coderabbit]
Reddit. (2023). AI-generated spam pull requests?[reddit]
Wagtail. (2023). Open source maintenance, new contributors, and AI agents.[wagtail]
arXiv. (2025). Assessing the Quality and Security of AI-Generated Code.[arxiv]
Reddit. (2024). I built an AI maintainer for open-source GitHub repositories.[reddit]
SecureFlag. (2024). The risks of generative AI coding in software development.[blog.secureflag]
Dev.to. (2025). The 6 Best AI Code Review Tools for Pull Requests in 2025.[dev]
Continue.dev. (2026). Why unowned AI contributions are breaking open source.[blog.continue]
DX. (2025). AI code generation: Best practices for enterprise adoption.[getdx]
Future Market Insights. (2025). AI Code Assistant Market Global Market Analysis Report.[futuremarketinsights]
LinkedIn. (2025). AI coding tools reshape development teams, says KeyBank CIO.[linkedin]
Menlo Ventures. (2026). 2025: The State of Generative AI in the Enterprise.[menlovc]
Grand View Research. (2023). Generative AI Coding Assistants Market Size Report, 2030.[grandviewresearch]
Shift Asia. (2025). How AI Coding Tools Help Boost Productivity for Developers.[shiftasia]
Glean. (2025). Top 10 trends in AI adoption for enterprises in 2025.[glean]
Stack Overflow. (2025). AI | 2025 Stack Overflow Developer Survey.[survey.stackoverflow]
HD Insight Research. (2025). AI Code Assistants Market Insights 2025.[hdinresearch]
Markets and Markets. (2025). AI Assistant Market worth $21.11 billion by 2030.[marketsandmarkets]
Pragmatic Coders. (2026). Best AI Tools for Coding in 2026.[pragmaticcoders]
arXiv. (2025). Synthetic Data Generation Using Large Language Models.[arxiv]
Reddit. (2024). Evidence that training models on AI-created data degrades their quality.[reddit]
LIACS. (2025). Security Vulnerabilities in LLM-Generated Code.[theses.liacs]
Neptune AI. (2025). Synthetic Data for LLM Training.[neptune]
LakeFS. (2025). Why Data Quality Is Key For ML Model Development & Training.[lakefs]
arXiv. (2024). LLM-CSEC: Empirical Evaluation of Security in C/C++ Code.[arxiv]
ACL Anthology. (2025). Case2Code: Scalable Synthetic Data for Code Generation.[aclanthology]
PromptCloud. (2025). AI Training Data: How to Source, Prepare & Optimize It.[promptcloud]
GB Hackers. (2025). New Research and PoC Reveal Security Risks in LLM-Generated Code.[gbhackers]
OpenAI Cookbook. (2025). Synthetic data generation (Part 1).[cookbook.openai]
Emergent Mind. (2025). LLM-Generated Code Security.[emergentmind]
Confident AI. (2025). Using LLMs for Synthetic Data Generation: The Definitive Guide.[confident-ai]
Sonar Source. (2025). OWASP LLM Top 10: How it Applies to Code Generation.[sonarsource]
Hedman Legal. (2024). Copyright and privacy implications of using artificial intelligence to generate code.[hedman]
Slashdot. (2025). How Should the Linux Kernel Handle AI-Generated Contributions.[linux.slashdot]
TechTarget. (2025). Does AI-generated code violate open source licenses?[techtarget]
WebProNews. (2025). Linux Kernel’s AI Code Revolution: Guidelines for the Machine Age.[webpronews]
Aera IP. (2024). ai matters: open source and generative ai.[aera-ip]
Red Hat. (2025). AI-assisted development and open source: legal issues.[redhat]
DevClass. (2026). LLVM project adopts “human in the loop” policy following AI-driven nuisance contributions.[devclass]
Hunton. (2025). Part 1 – Open Source AI Models: How Open Are They Really.[hunton]
Eurekasoft. (2025). Ai-generated Code and Copyright: Who owns Ai-written software.[eurekasoft]
ZDNet. (2025). AI is creeping into the Linux kernel – and official policy is needed asap.[zdnet]
Reddit. (2026). Copyright and AI… How does it affect open source?[reddit]
Reddit. (2025). Linux Kernel Proposal Documents Rules For Using AI.
It’s FOSS. (2025). GitHub’s 2025 Report Reveals Some Surprising Developer Trends.[itsfoss]
Salsa Digital. (2024). The state of AI and open source — the Octoverse report.[salsa]
Wikipedia. (2024). Model collapse.[en.wikipedia]
arXiv. (2025). Escaping Model Collapse via Synthetic Data Verification.[arxiv]
Reddit. (2024). Researcher shows Model Collapse easily avoided by keeping old human data.[reddit]
GitHub Blog. (2025). Octoverse 2024.
Nature. (2024). AI models collapse when trained on recursively generated data.
LinkedIn. (2025). How Software Engineering Will Change by 2035.[linkedin]
Morgan Stanley. (2025). How AI Coding Is Creating Jobs.[morganstanley]
GitHub Resources. (2025). The executive’s guide: How engineering teams are balancing AI and human oversight.[resources.github]
LinkedIn. (2025). The Future of Software Development (2025–2030).[linkedin]
Forbes. (2024). How Generative AI Will Change The Jobs Of Computer Programmers And Software Engineers.[forbes]
Aikido. (2025). Using AI for Code Review: What It Can (and Can’t) Do Today.[aikido]
Reddit. (2025). AI will “reinvent” developers, not replace them, says GitHub CEO.
GitHub Blog. (2025). The developer role is evolving. Here’s how to stay ahead.
World Economic Forum. (2025). Top 10 Jobs of the Future – For (2030) And Beyond.[weforum]
Brainhub. (2025). Is There a Future for Software Engineers? The Impact of AI.


