The Law, Policy, & AI Briefing #1: Laws on DRM & scraping evolve, there's now a Russian national code of ethics on AI, Zillow's algorithm creates massive losses, and more...
Your regular briefing on the intersection of law, policy, and artificial intelligence.
Hi all, welcome to the first edition of the Law, Policy, and AI Briefing. Briefings will go out roughly once a week – though probably not every week, because I also need to do research.
Who am I? I’m a PhD (Machine Learning)–JD candidate at Stanford University; you can learn more about my research here.
What is this? The goal of this letter is to round up some interesting bits of information and events at the intersection of Law, Policy, and AI. Sometimes I will weigh in with thoughts or more in-depth summaries. Feel free to send me things you think should be highlighted: @PeterHndrsn. Also… just in case, none of this is legal advice.
Your first briefing awaits below!
Law
A new paper, “Can I use this publicly available dataset to build commercial AI software? Most likely not,” has popped up on arXiv. It analyzes the use of datasets in commercial and non-commercial contexts and suggests that people aren’t really complying with dataset licenses and copyright. Keep in mind, though, that many jurisdictions (e.g., the United States, the European Union, and Japan) have fair-use or text/data-mining exemptions that can permit mining copyrighted data for non-commercial research.
The U.S. Copyright Office, as part of its triennial DMCA rulemaking process, has adopted an exemption that limits liability for removing DRM for the purpose of text/data mining in non-commercial research. Before this, if you removed DRM for text/data mining, you would probably have been liable under the DMCA. Now, probably not.
In recent oral arguments before the 9th Circuit, LinkedIn argued that bypassing an IP block for the purposes of scraping constitutes a violation of the Computer Fraud and Abuse Act (CFAA) - making it a potentially criminal act. The judges seemed unconvinced. The outcome of this case will determine, among other things, whether you are liable for using proxies to evade being blocked while scraping data for a dataset. If that describes your pipeline, keep a close eye on this case.
For dataset providers located in China, new rules might restrict what data can be transferred out of China. See analysis here. This has obvious implications for data sharing and hosting for the purposes of, say, training large language models.
The state of Michigan has passed a bill that would restrict government agents from using encrypted messaging apps. This is good for transparency, but probably not so great for security. Why is this relevant to AI? If you want to discover how governments are using AI, you often have to file a Freedom of Information Act (FOIA) request for communications between government employees. If those employees encrypt their messages or use apps like Signal, you would probably never get them — and never uncover potentially problematic uses of AI.
The CFPB will take action to prevent false identification by background screening companies. Without regulation, it is highly likely that ML will become commonly used for identity matching — it is already common in biometrics. The CFPB’s action aims to address bias and unfairness in background screening processes. “The risk of mistaken identities from name-only matching is likely to be greater among Hispanic, Black, and Asian communities because there is less surname diversity in those populations compared to the white population.” Mistaken identities can affect job prospects, credit ratings, and livelihoods — intervention and regulation are important.
Policy and Society
NATO has a new AI strategy. There is some mention of responsible use of AI, but it is a very high-level policy, and the details matter a lot here.
A voluntary Russian national code of ethics on artificial intelligence has been put forward by the AI Alliance jointly with the Analytical Center under the Russian government and the Economic Development Ministry. It was signed by “Sberbank, Gazprom Neft, Yandex, VK, MTS and the Russian Direct Investment Fund, as well as representatives of Skolkovo, Rostelecom, Rosatom, InfoWatch and real estate platform Cian.” It can be found here.
A job market paper in economics, “Loan Officers, Algorithms, & Credit Outcomes: Experimental Evidence from Pakistan,” examines real loan officer decisions in Pakistan as compared to an algorithm. The paper finds that “loan officers exhibit a gender equity preference and approve more women once they observe gender without raising overall loan default.” Conversely, “while discrimination declines for loan officers, it increases for the algorithm.” The outcome suggests that “blinding algorithms to applicant demographic characteristics may boost efficiency and ensure equity in developing economy credit markets.”
Zillow lays off a large part of its workforce after its home-pricing algorithm induces staggering losses. This is a cautionary tale about deploying machine learning models without thorough validation procedures in place — and without constant reassessment. I also wonder about the algorithm’s local effects on housing prices.
A new study from CSET examines robotics patents by country. Though Russia holds only 2% of all robotics patents, it holds 17% of robotics patents for military applications. The United States shows a similar skew, with 13.2% of all robotics patents but 26.2% of military robotics patents. China and Japan show the opposite pattern: China holds 34.6% of all robotics patents but 24.9% of military robotics patents, and Japan holds 20.8% of all robotics patents but only 2.2% of military robotics patents.