Web scraping in the hotels, hospitality and leisure industry
For travel and hospitality businesses, real-time market insights are crucial to remain competitive in an industry that is constantly changing. Businesses will want to track competitor pricing, customer reviews and market trends, and to do this as quickly and efficiently as possible. This can help inform strategic decision making, maximising competitiveness and profitability.
Web scraping enables the fast collection of large volumes of data and is commonly used in the hotels, hospitality and leisure industry. Increasingly, scraping is being used in combination with artificial intelligence (which analyses the data that has been scraped) to anticipate customer needs, act on real-time market insights, and stay ahead of competitors.
Web scraping is controversial, but whether it is unlawful is not a straightforward question to answer. This article aims to summarise the main legal considerations and offer some practical insights into how to minimise risks, whichever side of the web scraping line you fall on.
What is web scraping?
Web scraping typically involves the use of a software application (a "bot") to extract large amounts of data from public-facing websites. This could include text, reviews and pricing information. The harvested data can then be used for data analysis (for example, identifying market trends and patterns), or it may be republished in some form (for example, on price comparison websites).
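By way of illustration only, the extraction step described above can be sketched in a few lines of Python. The sample HTML, the class names "name" and "price" and the helper extract_rooms are all assumptions made for this example and are not taken from any real website; a real bot would also fetch pages over the network, which is where the legal issues discussed below begin to arise.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; real sites will use different markup.
SAMPLE_HTML = """
<ul>
  <li class="room"><span class="name">Double</span> <span class="price">120 GBP</span></li>
  <li class="room"><span class="name">Suite</span> <span class="price">210 GBP</span></li>
</ul>
"""


class PriceExtractor(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> per <li>."""

    def __init__(self):
        super().__init__()
        self._field = None   # which field the parser is currently inside, if any
        self._current = {}   # fields gathered for the list item being read
        self.rooms = []      # one dict per completed list item

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.rooms.append(self._current)
            self._current = {}


def extract_rooms(html):
    """Return a list of {"name": ..., "price": ...} dicts found in the HTML."""
    parser = PriceExtractor()
    parser.feed(html)
    return parser.rooms


rooms = extract_rooms(SAMPLE_HTML)
```

The point of the sketch is simply that structured data (here, room names and prices) can be pulled out of a public page automatically and at scale, ready for analysis or republication.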
How is web scraping used in the hotels, hospitality and leisure industry?
Web scraping is typically used in the industry to gather data on the price and availability of flights, hotels, tours, events and car rentals. This data is then used to adjust pricing strategies accordingly (for example, increasing ticket prices for particularly popular days or times – known as surge or dynamic pricing). Web scraping is used across social media and review sites to gauge customer sentiment, which helps tailor services to meet customer expectations and preferences. Harvesting information from industry news websites and travel blogs can also help predict upcoming trends that may affect demand. In short, when combined with increasingly powerful AI tools, web scraping can deliver valuable insights and competitive advantage.
What are the legal issues?
There is no specific law that prohibits web scraping. However, the manner and purpose of web scraping can lead to various legal issues, particularly concerning intellectual property, contractual terms and data protection. Businesses that want to use data which has been scraped (either by themselves or by a third party) will need to be aware of these issues to try to stay on the right side of the line in case legal challenges are made. Equally, knowledge of these issues can help a business which wants to prevent others from scraping or re-utilising its data.
Copyright
Copyright arises automatically and protects original artistic and literary works. This includes tables or compilations (if "original") and databases which, by reason of the selection or arrangement of their contents, constitute the author’s own intellectual creation (which involves a higher threshold of originality than for other copyright works).
Scraping copyrighted work without a licence and/or then communicating a substantial part of that work to the public (for example, by reproducing it on the scraper’s own website) is likely to be infringement. The key issues are often whether the data that has been taken is protected by copyright and/or whether the business from which it has been taken owns that copyright (it may have been created for them by someone else).
The UK Government has recently held a consultation on plans to reform copyright law to address the competing interests of rightsholders and AI developers (who wish to use processes such as web scraping over copyrighted work to train AI models). It has sought comment on its plans to permit text and data mining for commercial purposes – AI developers would then be able to train models on large volumes of web-based material without risk of infringement (unless rightsholders expressly opt out, in which case a licence would be required). This would bring the UK more closely in line with the EU regime. The plan is controversial and has attracted a lot of criticism but remains under consideration.
Database right
A website can also be protected by a separate database right under the Copyright and Rights in Databases Regulations 1997, where it consists of "a collection of independent works, data or other materials which are arranged in a systematic or methodical way and are individually accessible by electronic or other means". Database right arises automatically and exists independently of copyright. For it to apply, there must have been a "substantial investment" in obtaining, verifying or presenting the contents of the database. In other words, it must be shown that effort was made to find existing materials and collect these to form the database. So, if a website provider has spent significant time and resource finding and putting together a collection of existing flight and accommodation options and has carefully organised these options on its website in a systematic or methodical way, then that collection may be protected by database right. However, any investment used to create the database contents themselves does not count (for example, an airline drawing up its own flight schedules). This can often mean that the database right is quite narrow and does not protect the data that is being scraped.
Database right does not apply to individual pieces of data within the database but instead protects the database as a whole. The right is infringed if a person – including a web scraper – extracts or re-utilises all or a "substantial" part of the contents of the database. "Substantial" here can refer to quantity and/or quality. Even if only a small part of a database is extracted or re-utilised, if that is done repeatedly then this could also be infringement.
Contractual terms
A website’s terms and conditions may contain clauses expressly prohibiting web scraping. So long as those terms are validly incorporated and contractually enforceable (which is not always the case – for example, if a user is not required to click to accept the terms or the terms are not clearly referenced on the website), scraping that website may be a breach of contract. Provided that access to the website content is constructed in the right way so that the terms and conditions apply, a claim for breach of contract can often be the most straightforward means to stop scraping.
Data protection
If website data includes any information which identifies living individuals – i.e. personal data – anyone wishing to carry out scraping will need to comply with the UK GDPR. This will be particularly relevant when seeking to scrape social media or customer review platforms.
Processing of personal data must be done lawfully, fairly and in a transparent manner. Legitimate interests is likely to be the only applicable lawful basis for scraping (the ICO has confirmed that this can be the case for using web-scraped data to train generative AI): i.e. the scraper needs to show that the scraping is necessary for their legitimate interests or the legitimate interests of a third party, unless there is a good reason to protect the individual’s personal data which overrides those legitimate interests. If relying on this ground for processing the personal data, it would be advisable to document the decision-making process through a Legitimate Interests Assessment.
Fairness and transparency are likely to be significant challenges in the context of web scraping because individuals will most likely be unaware that their personal data is being scraped. In other words, web scraping of personal data may be "invisible processing". In this case it would be advisable to prepare a Data Protection Impact Assessment (DPIA). A DPIA is a process designed to help organisations identify and minimise data protection risks. Any failure by an organisation to carry out a DPIA when it should have done so can lead to a heavy regulatory fine from the ICO.
The issue with relying on data protection rights to stop web scraping and subsequent utilisation of personal data is that these rights strictly reside with the individuals whose data is being taken and used. However, in our experience the English court is likely to be sympathetic to a business seeking to protect the rights of individuals like its customers or reviewers whose data protection rights may have been infringed.
What should businesses do?
Travel and hospitality businesses which want to take advantage of the benefits of web scraping should try to ensure that they have a basis to argue that they have not acted unlawfully. That is still the case if a business engages a third party to carry out the scraping: if the third party scrapes unlawfully, the business may still be liable for database right or copyright infringement through common design, or liable for inducing a breach of contract by that third party. It will be important for businesses to be alive to the legal issues surrounding web scraping and keep their own practices under review in this context.
Equally, rightsholders may wish to prevent others from scraping their websites. Those rightsholders should as a first step seek to ensure that their website terms and conditions make it clear that scraping is not allowed and that access to the website is constructed in such a way that those terms and conditions bite on those planning to engage in scraping. Technical solutions to stop scraping are also very important, such as robots.txt files (which contain instructions for bots on what they may and may not access), CAPTCHA (a security test that prevents bots from accessing websites) and paywalls.
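As a rough sketch of how robots.txt instructions work in practice, the Python example below uses the standard library's urllib.robotparser to test whether a given bot is permitted to request a given URL. The rules, the bot names "ExampleBot" and "PriceWatcher", the domain hotel.example and the helper may_fetch are all hypothetical and chosen only for illustration. Note that robots.txt relies on bots choosing to honour it; it does not technically block access, which is one reason it is usually combined with contractual terms and other measures.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is barred entirely, and all
# other bots are asked to stay out of the /prices/ section.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /prices/
Allow: /
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt allow user_agent to request url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory; no network call
    return parser.can_fetch(user_agent, url)


# A well-behaved scraper would run a check like this before each request.
about_allowed = may_fetch(ROBOTS_TXT, "PriceWatcher", "https://hotel.example/about")
prices_allowed = may_fetch(ROBOTS_TXT, "PriceWatcher", "https://hotel.example/prices/today")
named_bot_allowed = may_fetch(ROBOTS_TXT, "ExampleBot", "https://hotel.example/about")
```

A business commissioning scraping could ask its provider to evidence a check of this kind, and a rightsholder can use the same approach to confirm that its published rules say what it intends.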
This publication is a general summary of the law. It should not replace legal advice tailored to your specific circumstances.
© Farrer & Co LLP, April 2025