{"id":13879,"date":"2026-01-03T04:14:30","date_gmt":"2026-01-03T04:14:30","guid":{"rendered":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/?p=13879"},"modified":"2026-01-03T04:18:25","modified_gmt":"2026-01-03T04:18:25","slug":"between-innovation-and-infringement-dpiits-dilemma-on-ai-training-and-copyright","status":"publish","type":"post","link":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/between-innovation-and-infringement-dpiits-dilemma-on-ai-training-and-copyright\/","title":{"rendered":"Between Innovation and Infringement: DPIIT\u2019s Dilemma on AI Training and Copyright"},"content":{"rendered":"\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Background\"><\/span>Background<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The Department for Promotion of Industry and Internal Trade (DPIIT), which functions under the Ministry of Commerce and Industry, has published a working paper on the use of copyrighted material as input for AI training. The committee behind the paper was tasked with assessing whether the current legal framework on copyright sufficiently addresses the issues raised by generative AI (Gen AI) models or whether amendments are needed. 
This policy examination becomes significant once we consider how AI models are actually trained.<\/p>\n\n\n\n<p>Large language models such as OpenAI\u2019s ChatGPT, Claude, and Gemini are trained on copyrighted data using Text Data Mining (TDM) techniques, which automatically extract meaningful text, information, patterns, and insights from large volumes of unstructured data.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"AI_Training_and_Text_Data_Mining_TDM\"><\/span>AI Training and Text Data Mining (TDM)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Aspect<\/th><th>Description<\/th><\/tr><\/thead><tbody><tr><td>Training Input<\/td><td>Large volumes of copyrighted and non-copyrighted textual data<\/td><\/tr><tr><td>Technique Used<\/td><td>Text Data Mining (TDM)<\/td><\/tr><tr><td>Purpose<\/td><td>Extraction of patterns, information, and insights from unstructured data<\/td><\/tr><tr><td>Examples of Models<\/td><td>ChatGPT, Claude, Gemini<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Even data-protection techniques such as <strong>Technological Protection Measures (TPMs)<\/strong> are ineffective against TDM, because TPMs only control actions like viewing, copying, downloading, or printing. Digital Rights Management (DRM) measures, such as those used by Netflix and Amazon Kindle, exist to stop users from pirating content. An AI model, however, does not pirate or store the data; it learns from the data in real time. 
This technological gap shifts the question from access control to whether such learning itself amounts to copyright infringement.<\/p>\n\n\n\n<p>The concern is infringement of the copyright of authors and creators of literary works. These models are trained on expression, not ideas, and copyright protects expression. Gen AI models can also generate content that substitutes for books, articles, and similar works. To address this, DPIIT has proposed a solution: the <strong>\u201cHybrid Model.\u201d<\/strong><\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Hybrid_Model\"><\/span>The Hybrid Model<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The working paper proposes a <strong>\u201cHybrid Model.\u201d<\/strong> Under this model, a statutory blanket licence with a remuneration right will be imposed, and creators will not be able to withhold their works from use in training AI systems.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Copyright_Royalties_Collective_for_AI_Training_CRCAT\"><\/span>Copyright Royalties Collective for AI Training (CRCAT)<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>The Copyright Royalties Collective for AI Training (CRCAT), a non-profit entity, will be created by associations of rightsholders and Collective Management Organizations (CMOs), and will be designated by the central government under the Copyright Act, 1957 to collect royalties and distribute them to authors and creators.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Membership_of_CRCAT\"><\/span>Membership of CRCAT<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Collective Management Organizations (CMOs) formed by rightsholders, along with copyright societies, will be members of CRCAT. CRCAT will be the governing body that safeguards the collection, administration, and distribution of royalties. 
A committee formed by the central government, the \u201cRate Setting Authority,\u201d will determine royalties.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Royalty_Setting_and_Distribution\"><\/span>Royalty Setting and Distribution<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Royalty rates are decided by the Rate Setting Committee, which consists of senior government officers, senior legal experts, financial or economic experts, technical experts, a member from CRCAT, and a representative of AI developers. For now, the royalty is based on a flat-rate model: a certain percentage of the gross global revenue earned by AI developers from the AI system. Payment falls due only after the AI model has generated revenue, and it is payable annually.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Structural_Flaws_in_the_Proposed_Hybrid_Model_of_Copyright_Remuneration\"><\/span>Structural Flaws in the Proposed Hybrid Model of Copyright Remuneration<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Mandatory Licensing Without Opt-Out Weakens the Property Nature of Copyright<\/strong><br>Copyright is by nature an exclusive right, but under the Hybrid Model it becomes a statutory entitlement to uncertain remuneration. Rightsholders cannot refuse the use of their work; consent is replaced with compulsion and mandatory inclusion, and compensation is delayed. Together, these weaken the value of copyrighted material.<\/li>\n\n\n\n<li><strong>Royalty Distribution Inequality<\/strong><br>CRCAT will distribute royalties through CMOs and copyright societies. 
The critical flaw lies in the valuation process: a single investigative report by a small outlet will be diluted by large media houses with massive archives. The sheer size of those archives will benefit large media houses disproportionately, undermining the stated goal of protecting small creators.<\/li>\n\n\n\n<li><strong>Structural Flaws in the Revenue Attribution System<\/strong><br>A royalty system works best when the use of copyrighted material is traceable. AI model developers fund the royalty pool through the flat-rate fee, but the question is how royalties will be allocated to specific copyrighted works when AI is trained on billions of heterogeneous data points. Developers are not obliged to disclose their full training data sources because of trade secrets. Royalty distribution therefore becomes statistical guesswork rather than rights-based compensation, and royalties ultimately become a tax on AI companies for using the works rather than compensation to authors.<\/li>\n\n\n\n<li><strong>Absence of Detailed Usage Monitoring<\/strong><br>The Hybrid Model explicitly avoids dataset-level transparency, citing innovation and trade secrets. Rightsholders therefore have no way to verify how extensively their work was used. For example, if two works were used to train a model and work \u201cA\u201d was used 10 times more than work \u201cB\u201d, the creators cannot confirm this and so cannot challenge underpayment. This leads to creators being underpaid.<\/li>\n\n\n\n<li><strong>Governance and Risk of Institutional Capture<\/strong><br>CRCAT, nominally a centralized non-profit made up of designated copyright societies and CMOs under government oversight, faces a significant risk of institutional capture. Since CMOs largely represent major publishers and music labels, their likely dominance within CRCAT&#8217;s internal committees could give them undue influence. 
This control would allow them to unilaterally set vital operational parameters, such as royalty allocation and administrative deduction rates.<\/li>\n\n\n\n<li><strong>Incentive Misalignment<\/strong><br>Copyright fundamentally rewards the work itself: its originality, investment, and effort. The Hybrid Model, by contrast, rewards archive size, historical dominance, and registration volume rather than quality. In the long term, the quality of works will degrade as royalties become detached from actual value creation.<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The working paper is still open to responses, so the Hybrid Model is not necessarily the final mechanism. By introducing blanket licensing, the Hybrid Model does solve the problem of lawful data access for AI developers, but it undermines the core principles of copyright, namely the exclusive right to use and reproduce the work and moral rights. A redesign that addresses these issues could make the model work. A reasonable opt-out mechanism would narrow the scope for AI bias and preserve the value of copyrighted work, while full disclosure of data sets would allow creators and authors to be fairly compensated and would preserve moral rights and the creator\u2019s dignity. This would also help the royalty model work efficiently rather than disproportionately rewarding the sheer size of archives. The flat-rate system also fails to account for the global AI market and legal disputes around the world. 
This exposes developers to double payment for the same data sets.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"References\"><\/span>References<span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<figure class=\"wp-block-embed\"><div class=\"wp-block-embed__wrapper\">\nhttps:\/\/www.dpiit.gov.in\/static\/uploads\/2025\/12\/ff266bbeed10c48e3479c941484f3525.pdf\n<\/div><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>This article critically analyses the working paper published by the Department for Promotion of Industry and Internal Trade. The working paper proposes a hybrid model that applies mandatory blanket licensing. The article critically analyses this model and its after-effects.<\/p>\n","protected":false},"author":953,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"two_page_speed":[],"_jetpack_memberships_contains_paid_content":false,"_joinchat":[],"footnotes":""},"categories":[21],"tags":[28],"class_list":{"0":"post-13879","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-intellectual-property","7":"tag-top-news"},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/posts\/13879","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/users\/953"}],"rep
lies":[{"embeddable":true,"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/comments?post=13879"}],"version-history":[{"count":0,"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/posts\/13879\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/media?parent=13879"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/categories?post=13879"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.legalserviceindia.com\/Legal-Articles\/wp-json\/wp\/v2\/tags?post=13879"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}