[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"$fuyoulw6UxqDRTqpc8cdXhe5oIaL4MJr1yWupAsf52zE":3},{"tableOfContents":4,"markDownContent":5,"htmlContent":6,"metaTitle":7,"metaDescription":7,"wordCount":8,"readTime":9,"title":10,"nbDownloads":11,"excerpt":7,"lang":12,"url":13,"intro":14,"featured":15,"state":16,"author":17,"authorId":18,"datePublication":22,"dateCreation":23,"dateUpdate":24,"mainCategory":25,"categories":41,"metaDatas":47,"imageUrl":48,"imageThumbUrls":49,"id":57},true,"## What is scraping?\r\n \r\nData scraping is a computer technique whereby a programme extracts information from human-readable computer sources.\r\n \r\n**Data scraping** makes it possible to extract and structure data from unorganised sources, from which the information may be difficult to understand and extract.\r\n \r\nThere are many forms of scraping (parsing, report mining, screen scraping, etc.), but one of the most interesting is **web scraping**. It involves **extracting information from web pages**, usually in an **automated** way, using **dedicated software.**\r\n \r\n## How is scraping used?\r\n \r\nThe technique can be used for **mass resale**, by companies specialising in the collection and resale of data, or for **personal commercial canvassing** by companies collecting data on their own behalf.\r\n \r\nMost often, the data collected is **identification or contact data**, such as first and last name, telephone number, e-mail or home address, etc.\r\n \r\n## Is web scraping legal outside Europe?\r\n \r\nThe massive use of web scraping by the **internal services of social networks** and the need for companies to use this data to **achieve their commercial objectives** have recently led the American courts to rule in favour of the use of such tools.\r\n \r\nThe principle under US law is that companies can carry out **any operation** involving data as long as **the law does not prohibit it**. 
The decision [***hiQ Labs v. LinkedIn of 18 April 2022***](https://cdn.ca9.uscourts.gov/datastore/opinions/2022/04/18/17-16783.pdf) gives precedence to **the right to conduct business** over **the protection of users' privacy**. The Court considers that people who choose to post personal information on their public profiles cannot claim that their personal data should remain private and shielded from tools such as scraping.\r\n \r\nThe approach is reversed in Europe, where the data controller must rely on an appropriate **[legal basis](https://www.dastra.eu/en/guide/legal-basis-for-processing/56301)** before any use is made of the data. Otherwise, data processing will be **in principle unlawful**.\r\n \r\n## In Europe, doesn't the greater protection afforded to personal data by the GDPR preclude such uses?\r\n \r\nData on platforms or social networks is very often **public**, as no identification is required to consult it. However, it is still [**personal data**](https://www.dastra.eu/en/guide/personal-data/56315). As a result, the processing of such data falls under the scope of the **GDPR** when it applies and the users of such tools become [**data controllers**](https://www.dastra.eu/en/guide/what-are-my-obligations-as-a-data-controller/56294).\r\n \r\nA number of its articles are likely to be **violated by such a practice**.\r\n \r\n## As a data controller subject to the GDPR, can we still use scraping on platforms such as LinkedIn? In what context?\r\n \r\nWeb scraping is a tool governed by the GDPR, and its use is subject to 3 main conditions:\r\n \r\n#### 1) You must not violate the general terms and conditions of use of the platform from which the data originates.\r\n \r\nThe first difficulty lies in the platform's **terms and conditions of use**. 
LinkedIn's T&amp;Cs, for example, state: \"*You agree not to develop, support or use software, devices, scripts, robots or any other means or process designed to perform web scraping of the Services or otherwise copy profiles and other data from the Services*\".\r\n \r\nThus, using scraping on LinkedIn **exposes the user to penalties**, especially as the platform has implemented powerful algorithms to detect these tools.\r\n \r\n#### 2) Complying with GDPR obligations regarding direct commercial prospecting\r\n \r\nIn terms of **commercial canvassing of individuals**, the principle is the **prohibition of direct canvassing** in the absence of **information** and the **obtaining of prior consent** from the individual. The only possible legal basis for commercial prospecting is therefore the consent of individuals.\r\n \r\nThe only exception to this principle is when the individual, in the context of the platform, can reasonably expect their data to be **re-used** for this purpose.\r\n \r\nOn the subject of people's expectations and the use of scraping on LinkedIn, a [**CNIL deliberation of 8 December 2020**](https://www.legifrance.gouv.fr/cnil/id/CNILTEXT000042848036) is particularly enlightening.\r\n \r\nNestor, a company selling meals in the workplace, used a **web scraping** tool to build up a **customer database via the LinkedIn network**. It invoked the legal basis of its **legitimate interest** in prospecting professionals in order to build up and use this database. Indeed, **professional-to-professional (B2B) canvassing** on subjects **related to their business** can be carried out **without their consent**.\r\n \r\nHowever, the CNIL first considered that \"*the canvassing messages for the sale of meals at people's place of work **have little connection with the professional activity of the prospects***\", and then ruled that there had been a breach of the GDPR in that the company had failed to fulfil its obligations to **inform** and **obtain consent**. 
The company was unable to rely on its legitimate interest in carrying out this [**data processing**](https://www.dastra.eu/en/guide/data-processing-activity/56354).\r\n \r\n#### 3) Comply with the main principles of the GDPR applicable to all data processing.\r\n \r\nCommon breaches include the **failure to inform individuals**, the **failure to obtain consent**, or the **failure to respect individuals' right to object**, in particular when individuals have already objected to any re-use for canvassing purposes.\r\n \r\nIn addition, if the company uses a service provider to supply the tool, it must ensure that the **obligations of Chapter IV of the GDPR on the use of processors** are met.\r\n \r\nLastly, it may be compulsory in certain cases to **carry out a [PIA / DPIA](https://www.dastra.eu/en/guide/data-protection-impact-assessment-pia-or-dpia/56278)**. Even if it is not compulsory, given the characteristics of such processing, it is **recommended that one be carried out in any event**.\r\n\r\n> Dastra can help you comply with the GDPR with a simple and effective solution: **[ask us for a demo](https://www.dastra.eu/en/contacts/demo)**.\r\n\r\n## What are the penalties for non-compliance with these obligations?\r\n \r\n**On the basis of the GDPR**, it is possible to be sanctioned for violating **articles 5, 12 and 13** of the GDPR (**principles of processing** and **rights of individuals**). **Article 83** of the GDPR provides for an **administrative fine** of up to **€20 million** or **4% of the company's total worldwide annual turnover**, whichever is higher.\r\n \r\nIn addition, a specific offence can exist in national laws. For instance, the French **Penal Code** provides for the offence of ***\"fraudulent, unfair or unlawful collection of personal data\"***, set out in Article 226-18.  \r\n Any **collection carried out fraudulently (e.g. 
without the knowledge of the persons concerned**), regardless of whether the data is public or not, is punishable by **five years' imprisonment** and a fine of **300,000 euros**.\r\n \r\nScraping can also be used to commit another offence: **infringement of the rights of the database producer**. Databases are protected by **copyright** and by a **sui generis right** (in its own right) protecting the producer of the database (articles L. 112-3 and L. 341-1 of the French Intellectual Property Code).  \r\n The database producer may **prohibit any extraction of a substantial part of the database**, as well as its **reuse by making it available to the public**.  \r\n The penalties incurred are a **300,000 euro fine** and **3 years' imprisonment**.","\u003Ch2 id=\"what-is-scraping\">What is scraping?\u003C/h2>\r\n\u003Cp>Data scraping is a computer technique whereby a programme extracts information from human-readable computer sources.\u003C/p>\r\n\u003Cp>\u003Cstrong>Data scraping\u003C/strong> makes it possible to extract and structure data from unorganised sources, from which the information may be difficult to understand and extract.\u003C/p>\r\n\u003Cp>There are many forms of scraping (parsing, report mining, screen scraping, etc.), but one of the most interesting is \u003Cstrong>web scraping\u003C/strong>. 
It involves \u003Cstrong>extracting information from web pages\u003C/strong>, usually in an \u003Cstrong>automated\u003C/strong> way, using \u003Cstrong>dedicated software.\u003C/strong>\u003C/p>\r\n\u003Ch2 id=\"how-is-scraping-used\">How is scraping used?\u003C/h2>\r\n\u003Cp>The technique can be used for \u003Cstrong>mass resale\u003C/strong>, by companies specialising in the collection and resale of data, or for \u003Cstrong>personal commercial canvassing\u003C/strong> by companies collecting data on their own behalf.\u003C/p>\r\n\u003Cp>Most often, the data collected is \u003Cstrong>identification or contact data\u003C/strong>, such as first and last name, telephone number, e-mail or home address, etc.\u003C/p>\r\n\u003Ch2 id=\"is-web-scraping-legal-outside-europe\">Is web scraping legal outside Europe?\u003C/h2>\r\n\u003Cp>The massive use of web scraping by the \u003Cstrong>internal services of social networks\u003C/strong> and the need for companies to use this data to \u003Cstrong>achieve their commercial objectives\u003C/strong> have recently led the American courts to rule in favour of the use of such tools.\u003C/p>\r\n\u003Cp>The principle under US law is that companies can carry out \u003Cstrong>any operation\u003C/strong> involving data as long as \u003Cstrong>the law does not prohibit it\u003C/strong>. The decision \u003Ca href=\"https://cdn.ca9.uscourts.gov/datastore/opinions/2022/04/18/17-16783.pdf\" rel=\"nofollow\">\u003Cem>\u003Cstrong>hiQ Labs v. LinkedIn of 18 April 2022\u003C/strong>\u003C/em>\u003C/a> gives precedence to \u003Cstrong>the right to conduct business\u003C/strong> over \u003Cstrong>the protection of users' privacy\u003C/strong>. 
The Court considers that people who choose to post personal information on their public profiles cannot claim that their personal data should remain private and shielded from tools such as scraping.\u003C/p>\r\n\u003Cp>The approach is reversed in Europe, where the data controller must rely on an appropriate \u003Cstrong>\u003Ca href=\"https://www.dastra.eu/en/guide/legal-basis-for-processing/56301\">legal basis\u003C/a>\u003C/strong> before any use is made of the data. Otherwise, data processing will be \u003Cstrong>in principle unlawful\u003C/strong>.\u003C/p>\r\n\u003Ch2 id=\"in-europe-doesnt-the-greater-protection-afforded-to-personal-data-by-the-rgpd-preclude-such-uses\">In Europe, doesn't the greater protection afforded to personal data by the GDPR preclude such uses?\u003C/h2>\r\n\u003Cp>Data on platforms or social networks is very often \u003Cstrong>public\u003C/strong>, as no identification is required to consult it. However, it is still \u003Ca href=\"https://www.dastra.eu/en/guide/personal-data/56315\">\u003Cstrong>personal data\u003C/strong>\u003C/a>. As a result, the processing of such data falls under the scope of the \u003Cstrong>GDPR\u003C/strong> when it applies and the users of such tools become \u003Ca href=\"https://www.dastra.eu/en/guide/what-are-my-obligations-as-a-data-controller/56294\">\u003Cstrong>data controllers\u003C/strong>\u003C/a>.\u003C/p>\r\n\u003Cp>A number of its articles are likely to be \u003Cstrong>violated by such a practice\u003C/strong>.\u003C/p>\r\n\u003Ch2 id=\"as-a-data-controller-subject-to-the-gdpr-can-we-still-use-scraping-on-platforms-such-as-linkedin-in-what-context\">As a data controller subject to the GDPR, can we still use scraping on platforms such as LinkedIn? 
In what context?\u003C/h2>\r\n\u003Cp>Web scraping is a tool governed by the GDPR, and its use is subject to 3 main conditions:\u003C/p>\r\n\u003Ch4 id=\"you-must-not-violate-the-general-terms-and-conditions-of-use-of-the-platform-from-which-the-data-originates\">1) You must not violate the general terms and conditions of use of the platform from which the data originates.\u003C/h4>\r\n\u003Cp>The first difficulty lies in the platform's \u003Cstrong>terms and conditions of use\u003C/strong>. LinkedIn's T&amp;Cs, for example, state: \"\u003Cem>You agree not to develop, support or use software, devices, scripts, robots or any other means or process designed to perform web scraping of the Services or otherwise copy profiles and other data from the Services\u003C/em>\".\u003C/p>\r\n\u003Cp>Thus, using scraping on LinkedIn \u003Cstrong>exposes the user to penalties\u003C/strong>, especially as the platform has implemented powerful algorithms to detect these tools.\u003C/p>\r\n\u003Ch4 id=\"complying-with-gdpr-obligations-regarding-direct-commercial-prospecting\">2) Complying with GDPR obligations regarding direct commercial prospecting\u003C/h4>\r\n\u003Cp>In terms of \u003Cstrong>commercial canvassing of individuals\u003C/strong>, the principle is the \u003Cstrong>prohibition of direct canvassing\u003C/strong> in the absence of \u003Cstrong>information\u003C/strong> and the \u003Cstrong>obtaining of prior consent\u003C/strong> from the individual. 
The only possible legal basis for commercial prospecting is therefore the consent of individuals.\u003C/p>\r\n\u003Cp>The only exception to this principle is when the individual, in the context of the platform, can reasonably expect their data to be \u003Cstrong>re-used\u003C/strong> for this purpose.\u003C/p>\r\n\u003Cp>On the subject of people's expectations and the use of scraping on LinkedIn, a \u003Ca href=\"https://www.legifrance.gouv.fr/cnil/id/CNILTEXT000042848036\" rel=\"nofollow\">\u003Cstrong>CNIL deliberation of 8 December 2020\u003C/strong>\u003C/a> is particularly enlightening.\u003C/p>\r\n\u003Cp>Nestor, a company selling meals in the workplace, used a \u003Cstrong>web scraping\u003C/strong> tool to build up a \u003Cstrong>customer database via the LinkedIn network\u003C/strong>. It invoked the legal basis of its \u003Cstrong>legitimate interest\u003C/strong> in prospecting professionals in order to build up and use this database. Indeed, \u003Cstrong>professional-to-professional (B2B) canvassing\u003C/strong> on subjects \u003Cstrong>related to their business\u003C/strong> can be carried out \u003Cstrong>without their consent\u003C/strong>.\u003C/p>\r\n\u003Cp>However, the CNIL first considered that \"\u003Cem>the canvassing messages for the sale of meals at people's place of work \u003Cstrong>have little connection with the professional activity of the prospects\u003C/strong>\u003C/em>\", and then ruled that there had been a breach of the GDPR in that the company had failed to fulfil its obligations to \u003Cstrong>inform\u003C/strong> and \u003Cstrong>obtain consent\u003C/strong>. 
The company was unable to rely on its legitimate interest in carrying out this \u003Ca href=\"https://www.dastra.eu/en/guide/data-processing-activity/56354\">\u003Cstrong>data processing\u003C/strong>\u003C/a>.\u003C/p>\r\n\u003Ch4 id=\"comply-with-the-main-principles-of-the-gdpr-applicable-to-all-data-processing\">3) Comply with the main principles of the GDPR applicable to all data processing.\u003C/h4>\r\n\u003Cp>Common breaches include the \u003Cstrong>failure to inform individuals\u003C/strong>, the \u003Cstrong>failure to obtain consent\u003C/strong>, or the \u003Cstrong>failure to respect individuals' right to object\u003C/strong>, in particular when individuals have already objected to any re-use for canvassing purposes.\u003C/p>\r\n\u003Cp>In addition, if the company uses a service provider to supply the tool, it must ensure that the \u003Cstrong>obligations of Chapter IV of the GDPR on the use of processors\u003C/strong> are met.\u003C/p>\r\n\u003Cp>Lastly, it may be compulsory in certain cases to \u003Cstrong>carry out a \u003Ca href=\"https://www.dastra.eu/en/guide/data-protection-impact-assessment-pia-or-dpia/56278\">PIA / DPIA\u003C/a>\u003C/strong>. 
Even if it is not compulsory, given the characteristics of such processing, it is \u003Cstrong>recommended that one be carried out in any event\u003C/strong>.\u003C/p>\r\n\u003Cblockquote>\r\n\u003Cp>Dastra can help you comply with the GDPR with a simple and effective solution: \u003Cstrong>\u003Ca href=\"https://www.dastra.eu/en/contacts/demo\">ask us for a demo\u003C/a>\u003C/strong>.\u003C/p>\r\n\u003C/blockquote>\r\n\u003Ch2 id=\"what-are-the-penalties-for-non-compliance-with-these-obligations\">What are the penalties for non-compliance with these obligations?\u003C/h2>\r\n\u003Cp>\u003Cstrong>On the basis of the GDPR\u003C/strong>, it is possible to be sanctioned for violating \u003Cstrong>articles 5, 12 and 13\u003C/strong> of the GDPR (\u003Cstrong>principles of processing\u003C/strong> and \u003Cstrong>rights of individuals\u003C/strong>). \u003Cstrong>Article 83\u003C/strong> of the GDPR provides for an \u003Cstrong>administrative fine\u003C/strong> of up to \u003Cstrong>€20 million\u003C/strong> or \u003Cstrong>4% of the company's total worldwide annual turnover\u003C/strong>, whichever is higher.\u003C/p>\r\n\u003Cp>In addition, a specific offence can exist in national laws. For instance, the French \u003Cstrong>Penal Code\u003C/strong> provides for the offence of \u003Cem>\u003Cstrong>\"fraudulent, unfair or unlawful collection of personal data\"\u003C/strong>\u003C/em>, set out in Article 226-18.\u003Cbr />\r\nAny \u003Cstrong>collection carried out fraudulently (e.g. without the knowledge of the persons concerned\u003C/strong>), regardless of whether the data is public or not, is punishable by \u003Cstrong>five years' imprisonment\u003C/strong> and a fine of \u003Cstrong>300,000 euros\u003C/strong>.\u003C/p>\r\n\u003Cp>Scraping can also be used to commit another offence: \u003Cstrong>infringement of the rights of the database producer\u003C/strong>. 
Databases are protected by \u003Cstrong>copyright\u003C/strong> and by a \u003Cstrong>sui generis right\u003C/strong> (in its own right) protecting the producer of the database (articles L. 112-3 and L. 341-1 of the French Intellectual Property Code).\u003Cbr />\r\nThe database producer may \u003Cstrong>prohibit any extraction of a substantial part of the database\u003C/strong>, as well as its \u003Cstrong>reuse by making it available to the public\u003C/strong>.\u003Cbr />\r\nThe penalties incurred are a \u003Cstrong>300,000 euro fine\u003C/strong> and \u003Cstrong>3 years' imprisonment\u003C/strong>.\u003C/p>\r\n",null,1176,7,"GDPR and web scraping: a legal practice? ",0,"en","gdpr-and-web-scraping-a-legal-practice","To scrape or not to scrape a website?",false,"Published",{"id":18,"displayName":19,"avatarUrl":20,"bio":7,"blogUrl":7,"color":7,"userId":18,"creationDate":21},38,"Paul-Emmanuel Bidault","https://static.dastra.eu/tenant-27/avatar/38/paul-emmanuel-bidault-150.jpg","2019-12-03T19:09:28","2023-12-27T14:41:07.456","2023-12-27T15:41:06.1066416","2023-12-27T15:56:12.6193946",{"id":26,"name":27,"description":28,"url":29,"color":30,"parentId":7,"count":7,"imageUrl":7,"parent":7,"order":11,"translations":31},2,"Blog","A list of curated articles provided by the community","blog","#28449a",[32,35,38],{"lang":33,"name":27,"description":34},"fr","Une liste d'articles rédigés par la communauté",{"lang":36,"name":27,"description":37},"es","Una lista de artículos escritos por la comunidad",{"lang":39,"name":27,"description":40},"de","Eine Liste von Artikeln, die von der Community verfasst 
wurden",[42],{"id":26,"name":27,"description":28,"url":29,"color":30,"parentId":7,"count":7,"imageUrl":7,"parent":7,"order":11,"translations":43},[44,45,46],{"lang":33,"name":27,"description":34},{"lang":36,"name":27,"description":37},{"lang":39,"name":27,"description":40},[],"https://static.dastra.eu/content/e6737547-f6e1-48ad-a18e-ef93d7d7e5ee/markus-spiske-iar-afb0qqw-unsplash-1000.jpg",[50,51,52,53,54,55,56],"https://static.dastra.eu/content/e6737547-f6e1-48ad-a18e-ef93d7d7e5ee/markus-spiske-iar-afb0qqw-unsplash-1000.webp","https://static.dastra.eu/content/e6737547-f6e1-48ad-a18e-ef93d7d7e5ee/markus-spiske-iar-afb0qqw-unsplash.webp","https://static.dastra.eu/content/e6737547-f6e1-48ad-a18e-ef93d7d7e5ee/markus-spiske-iar-afb0qqw-unsplash-1500.webp","https://static.dastra.eu/content/e6737547-f6e1-48ad-a18e-ef93d7d7e5ee/markus-spiske-iar-afb0qqw-unsplash-800.webp","https://static.dastra.eu/content/e6737547-f6e1-48ad-a18e-ef93d7d7e5ee/markus-spiske-iar-afb0qqw-unsplash-600.webp","https://static.dastra.eu/content/e6737547-f6e1-48ad-a18e-ef93d7d7e5ee/markus-spiske-iar-afb0qqw-unsplash-300.webp","https://static.dastra.eu/content/e6737547-f6e1-48ad-a18e-ef93d7d7e5ee/markus-spiske-iar-afb0qqw-unsplash-100.webp",56357]