How to Scrape Yelp With Python (2025 Guide)
Learn how to scrape Yelp data, including business details, reviews, and ratings, using Python. This guide covers the essentials to help you easily extract Yelp data with a simple script.
Yelp is a widely used platform where customers share reviews and experiences about local businesses. Since its launch in 2004, Yelp has grown to include:
- 287 million reviews across various categories like restaurants, shopping, and home services
- Over 7 million listed businesses, making it a go-to source for discovering local businesses (Source)
With millions of visitors each month, Yelp offers valuable insight into customer preferences. That makes its data useful for businesses and researchers who want to analyze market trends, collect reviews, and understand customer sentiment.
Python Tutorial: Scraping Yelp Reviews with Unwrangle API
Step 1: Prerequisites
Before you begin, ensure you have the following:
- API Key: Sign up on Unwrangle to get your API key.
- yelp-biz-id: This ID is unique to each business on Yelp. To find it, open the business page, use your browser's Inspect Element tool, and search the page source for yelp-biz-id.
- Python Installed: Make sure Python 3.x is installed on your system.
- Requests Library: If you don't already have the requests library, install it by running: pip install requests
Step 2: Making a Basic API Request
To scrape Yelp reviews, you need to make a GET request to the /api/getter endpoint with the following query parameters:
- Platform: Set this to "yelp_reviews".
- yelp-biz-id: The unique Yelp business ID.
- Api_key: Your Unwrangle API key.
- Page (optional): Specifies the page number of results. Default is 1.
Here's a Python example:
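A minimal sketch of such a request follows. This guide names only the /api/getter path, so the host in `API_URL` is an assumption, as are the lowercase query keys (`platform`, `api_key`, `page`) and the helper names `build_params` and `fetch_reviews`. The `yelp-biz-id` key is spelled as it appears in the list above; verify all of these against the Unwrangle documentation before relying on them.

```python
import requests

# Assumption: the host is not stated in this guide, only the /api/getter path.
API_URL = "https://data.unwrangle.com/api/getter/"


def build_params(biz_id, api_key, page=1):
    """Assemble the query parameters listed in Step 2."""
    return {
        "platform": "yelp_reviews",
        "yelp-biz-id": biz_id,  # key spelled as in this guide; confirm in the API docs
        "api_key": api_key,
        "page": page,           # optional; page 1 is the documented default
    }


def fetch_reviews(biz_id, api_key, page=1, timeout=30):
    """Send the GET request and return the decoded JSON body as a dict."""
    response = requests.get(
        API_URL,
        params=build_params(biz_id, api_key, page),
        timeout=timeout,
    )
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return response.json()
```

Calling `fetch_reviews("YOUR_YELP_BIZ_ID", "YOUR_API_KEY")` with real values would return the parsed response described in the next step. Keeping parameter assembly in its own function makes it easy to inspect what will be sent without touching the network.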
Step 3: Response Format
The API returns a JSON object containing the reviews and metadata. Here's a quick overview of the key fields:
Meta Information:
- success: Indicates whether the API call was successful.
- page: The current page of results.
- total_results: Total number of reviews available.
- no_of_pages: Total pages for all reviews.
- result_count: Number of reviews on the current page.
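These fields are enough to drive pagination. The sketch below keeps requesting pages until `no_of_pages` is reached. It assumes the meta fields sit at the top level of the JSON object next to a `reviews` array; `fetch_page` is a hypothetical callable that maps a page number to the decoded response, and `fake_fetch` returns invented data so the loop can be tried offline.

```python
def iter_all_reviews(fetch_page):
    """Yield every review by walking pages 1..no_of_pages.

    fetch_page is a hypothetical callable: page number -> decoded JSON dict.
    """
    page = 1
    while True:
        data = fetch_page(page)
        if not data.get("success"):
            raise RuntimeError(f"API call failed on page {page}")
        yield from data.get("reviews", [])
        # Stop once the last page reported by the API has been consumed.
        if page >= data.get("no_of_pages", 1):
            break
        page += 1


def fake_fetch(page):
    """Offline stand-in for the real API (invented data, two pages)."""
    pages = {1: [{"id": "a"}, {"id": "b"}], 2: [{"id": "c"}]}
    return {
        "success": True,
        "page": page,
        "total_results": 3,
        "no_of_pages": 2,
        "result_count": len(pages[page]),
        "reviews": pages[page],
    }


all_ids = [review["id"] for review in iter_all_reviews(fake_fetch)]
print(all_ids)  # ['a', 'b', 'c']
```

In real use, `fetch_page` would wrap the GET request from Step 2 with the business ID and API key already filled in.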
Review Details
Each review in the reviews array contains the following attributes:
Attribute | Data Type | Description |
---|---|---|
id | string | Yelp's unique ID for the review |
date | string | Date when the review was published |
rating | integer | Star rating provided by the reviewer (1-5) |
review_text | string | The full text content of the review |
review_url | string | Direct link to the review on Yelp |
lang | string | Two-letter language code for the review (e.g., en) |
author_avatar | string | URL of the reviewer's profile avatar |
author_name | string | Name of the reviewer |
author_url | string | Link to the reviewer's Yelp profile |
review_imgs | list | Links to images included in the review (if any) |
meta_data | dict | Feedback metrics including useful, funny, and cool votes |
location | string | City and state of the reviewer |
response | dict | Contains the business owner's response to the review, if available |
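If you prefer typed objects over raw dictionaries, the table can be mirrored in a small dataclass. This is only a sketch: the `Review` class and its `from_dict` helper are illustrative names, the inner keys of `meta_data` and `response` are not specified in this guide and are left as plain dicts, and the sample record is invented.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Review:
    """One entry of the reviews array, following the attribute table."""
    id: str
    date: str
    rating: int
    review_text: str
    review_url: str = ""
    lang: str = ""
    author_avatar: str = ""
    author_name: str = ""
    author_url: str = ""
    location: str = ""
    review_imgs: list = field(default_factory=list)
    meta_data: dict = field(default_factory=dict)
    response: Optional[dict] = None  # owner's reply, absent for most reviews

    @classmethod
    def from_dict(cls, raw):
        """Build a Review, ignoring keys that are not in the table."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})


# Invented sample record, shaped like the table above.
raw = {
    "id": "r1",
    "date": "2024-05-01",
    "rating": 4,
    "review_text": "Solid brunch spot.",
    "lang": "en",
    "unexpected_field": True,
}
review = Review.from_dict(raw)
print(review.rating, review.lang, review.review_imgs)
```

Dropping unknown keys in `from_dict` keeps the code from breaking if the API adds fields later.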
Step 4: Handling the API Response
To process the response and extract useful information:
- Parse the JSON response into a Python dictionary.
- Access the metadata (e.g., total reviews and pages).
- Iterate through the reviews array to extract individual review details.
Here's code to parse the reviews:
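The three steps above can be sketched as follows. The sample payload is invented and only shaped like the fields described in Step 3; it is not real Yelp data. The function name `parse_reviews` and the choice of which fields to keep are illustrative, and the meta fields are assumed to sit at the top level of the response.

```python
import json

# Invented sample shaped like the Step 3 fields (not real Yelp data).
SAMPLE = """
{"success": true, "page": 1, "total_results": 2, "no_of_pages": 1, "result_count": 2,
 "reviews": [
   {"id": "r1", "date": "2024-05-01", "rating": 5, "author_name": "Alex",
    "review_text": "Great tacos and friendly staff."},
   {"id": "r2", "date": "2024-05-03", "rating": 2, "author_name": "Sam",
    "review_text": "Long wait."}
 ]}
"""


def parse_reviews(raw_json):
    """Return (meta, rows) extracted from a raw JSON response string."""
    data = json.loads(raw_json)  # step 1: JSON text -> Python dict
    if not data.get("success"):
        raise ValueError("API reported an unsuccessful call")
    # Step 2: collect the metadata fields.
    meta = {key: data.get(key)
            for key in ("page", "total_results", "no_of_pages", "result_count")}
    # Step 3: walk the reviews array and keep a few fields per review.
    rows = [
        {
            "author": review.get("author_name"),
            "rating": review.get("rating"),
            "date": review.get("date"),
            "text": review.get("review_text", ""),
        }
        for review in data.get("reviews", [])
    ]
    return meta, rows


meta, rows = parse_reviews(SAMPLE)
print(f"Page {meta['page']} of {meta['no_of_pages']}, {meta['total_results']} reviews in total")
for row in rows:
    print(f"{row['date']} | {row['rating']} stars | {row['author']}: {row['text']}")
```

When the response comes from the requests library, `response.json()` performs the same decoding as `json.loads`, so the rest of the function applies unchanged to the resulting dict.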
Get Started with Unwrangle
Sign up for Unwrangle to:
- Access Yelp reviews through a simple API call.
- Get structured JSON responses with review data.
- Avoid dealing with proxies and CAPTCHAs.