Advanced Usage
Under Construction
This whole page is currently under construction.
Not worth your time at this point.
Overview
This page covers the advanced internals of recipe-scrapers, including abstract
classes, schema.org parsing, utility functions, exception handling, and the
plugin system. These components are primarily useful for contributors and
developers looking to extend the library's functionality.
Core Components
Abstract Base Classes
Classes
AbstractScraper
Source code in recipe_scrapers/_abstract.py
Functions
author()
canonical_url()
Canonical or original URL of the recipe.
category()
cook_time()
cooking_method()
cuisine()
description()
dietary_restrictions()
The specified dietary restrictions or guidelines for which this recipe is suitable
equipment()
host() (classmethod)
image()
ingredient_groups()
ingredients()
instructions()
instructions_list()
keywords()
language()
Language the recipe is written in.
Source code in recipe_scrapers/_abstract.py
links()
nutrients()
prep_time()
ratings()
ratings_count()
reviews()
site_name()
title()
to_json()
Recipe information in JSON format.
Source code in recipe_scrapers/_abstract.py
total_time()
Utility Functions
Classes
Functions
change_keys(obj, convert)
Recursively walks the dictionary obj and replaces each key with the result of calling the convert function on it.
Useful for fixing incorrect property keys, e.g. in JSON-LD dictionaries.
Credit: StackOverflow user 'baldr' (https://web.archive.org/web/20201022163147/https://stackoverflow.com/questions/11700705/python-recursively-replace-character-in-keys-of-nested-dictionary/33668421)
Note: with modifications applied.
Source code in recipe_scrapers/_utils.py
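To make the recursive key replacement concrete, here is a minimal standalone sketch. It is not the library's implementation: change_keys_sketch is a hypothetical name, and the handling of lists and tuples is an assumption about how nested containers might reasonably be treated.

```python
def change_keys_sketch(obj, convert):
    """Return a copy of obj with every dictionary key passed through convert.

    Hypothetical stand-in for illustration only; the real change_keys in
    recipe_scrapers/_utils.py may treat containers differently.
    """
    if isinstance(obj, dict):
        # Convert this level's keys, then recurse into the values.
        return {
            convert(key): change_keys_sketch(value, convert)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        # Assumption: dictionaries nested inside sequences are converted too.
        return type(obj)(change_keys_sketch(item, convert) for item in obj)
    # Scalars (strings, numbers, None) are returned unchanged.
    return obj


# Example: JSON-LD-like data whose keys carry stray trailing whitespace.
raw = {
    "@type": "Recipe",
    "nutrition ": {"calories ": "120 kcal"},
    "steps": [{"text ": "Mix."}],
}
cleaned = change_keys_sketch(raw, str.strip)
```

The sketch builds a new structure rather than mutating the input, so the original dictionary stays available for comparison.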
get_yields(element)
Returns a string of servings or items. If the recipe is for a number of items (not servings), it returns "x item(s)" where x is the quantity. This function handles cases where the yield is in dozens, such as "4 dozen cookies", returning "4 dozen" instead of "4 servings". It also accommodates yields specified in batches (e.g., "2 batches of brownies"), returning the yield as stated.
:param element: Should be a BeautifulSoup Tag; in cases where that is not feasible, it will be text.
:return: The number of servings, items, dozen, batches, etc.
Source code in recipe_scrapers/_utils.py
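The behaviour described above can be approximated with a much-simplified sketch. This is an assumption-laden stand-in, not the library's get_yields: yields_sketch is a hypothetical name, it accepts plain text only (no BeautifulSoup Tag), and its rule for telling servings from items is a deliberately naive guess.

```python
import re


def yields_sketch(text):
    """Normalise a raw yield string (hypothetical, simplified illustration)."""
    match = re.search(r"\d+", text)
    if match is None:
        # No quantity found: hand the text back untouched.
        return text.strip()

    count = int(match.group())
    lowered = text.lower()

    if "dozen" in lowered:
        return f"{count} dozen"
    if "batch" in lowered:
        return f"{count} batch{'' if count == 1 else 'es'}"

    # Assumption: a bare number or any mention of "serv..." means servings;
    # anything else is counted as items.
    rest = lowered.replace(match.group(), "", 1).strip()
    if not rest or "serv" in rest:
        return f"{count} serving{'' if count == 1 else 's'}"
    return f"{count} item{'' if count == 1 else 's'}"


examples = {
    text: yields_sketch(text)
    for text in ["Serves 4", "Makes 12 cookies", "4 dozen cookies", "2 batches of brownies"]
}
```

The dozen and batch checks run before the servings check so that "4 dozen cookies" is not reduced to a plain serving count.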
Exception Handling
Classes
ElementNotFoundInHtml
Bases: RecipeScrapersExceptions
Error when we cannot locate the HTML element on the page
Source code in recipe_scrapers/_exceptions.py
FieldNotProvidedByWebsiteException
Bases: StaticValueException
Error when, as far as we know, the website does not provide this info for any recipes.
Source code in recipe_scrapers/_exceptions.py
FillPluginException
Bases: RecipeScrapersExceptions
Inability to locate an element on a page by using a fill plugin
Source code in recipe_scrapers/_exceptions.py
NoSchemaFoundInWildMode
Bases: RecipeScrapersExceptions
The scraper was unable to locate schema.org metadata within the webpage.
Source code in recipe_scrapers/_exceptions.py
OpenGraphException
Bases: FillPluginException
Unable to locate element on the page using OpenGraph metadata
Source code in recipe_scrapers/_exceptions.py
RecipeSchemaNotFound
Bases: SchemaOrgException
No recipe schema metadata found on the page
Source code in recipe_scrapers/_exceptions.py
SchemaOrgException
Bases: FillPluginException
Error in parsing or missing portion of the Schema.org data on the page
Source code in recipe_scrapers/_exceptions.py
StaticValueException
Bases: RecipeScrapersExceptions
Error to communicate that the scraper is returning a fixed/static value.
Source code in recipe_scrapers/_exceptions.py
WebsiteNotImplementedError
Bases: RecipeScrapersExceptions
Error when website is not supported by this library.
Source code in recipe_scrapers/_exceptions.py
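The "Bases:" entries above form a hierarchy, which determines which except clause catches which error. The sketch below re-declares a few of the classes locally, as stand-ins, purely to show that effect. That RecipeScrapersExceptions derives from Exception is an assumption, and classify is a hypothetical helper.

```python
# Local stand-ins that mirror the documented "Bases:" relationships.
# Assumption: the root class derives from the built-in Exception.
class RecipeScrapersExceptions(Exception):
    pass


class FillPluginException(RecipeScrapersExceptions):
    pass


class SchemaOrgException(FillPluginException):
    pass


class RecipeSchemaNotFound(SchemaOrgException):
    pass


class OpenGraphException(FillPluginException):
    pass


class WebsiteNotImplementedError(RecipeScrapersExceptions):
    pass


def classify(exc):
    """Hypothetical helper: report which handler catches exc."""
    try:
        raise exc
    except SchemaOrgException:
        # Also catches RecipeSchemaNotFound, its subclass.
        return "schema.org problem"
    except FillPluginException:
        # Catches OpenGraphException and any other fill plugin failure.
        return "fill plugin problem"
    except RecipeScrapersExceptions:
        return "other library error"
```

Ordering handlers from most specific to most general, as here, lets a caller treat schema.org failures separately while still having a catch-all for the rest of the library's errors.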
Plugin System
Classes
ExceptionHandlingPlugin
Bases: PluginInterface
Plugin that is used only if settings.SUPPRESS_EXCEPTIONS is set to True.
The outer-most plugin and decorator.
If ANY of the methods listed raises ANY kind of exception, silence it and return the respective value from settings.ON_EXCEPTION_RETURN_VALUES.
If settings.SUPPRESS_EXCEPTIONS is set to False, this plugin is ignored and does nothing. In other words, exceptions are not handled and propagate to the caller; it is left to the end user to handle them.
Source code in recipe_scrapers/plugins/exception_handling.py
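The suppress-and-return-default behaviour can be illustrated with a plain decorator. This is only a sketch of the idea, not the plugin's actual code: the module-level SUPPRESS_EXCEPTIONS and ON_EXCEPTION_RETURN_VALUES stand in for the settings of the same names, their contents are invented for the example, and suppress_exceptions and DemoScraper are hypothetical.

```python
import functools

# Stand-ins for the settings named above; the values are illustrative only.
SUPPRESS_EXCEPTIONS = True
ON_EXCEPTION_RETURN_VALUES = {"title": None, "ingredients": []}


def suppress_exceptions(method):
    """Hypothetical decorator: swallow any exception and return a default."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if not SUPPRESS_EXCEPTIONS:
            # Suppression disabled: behave as if the decorator were absent.
            return method(self, *args, **kwargs)
        try:
            return method(self, *args, **kwargs)
        except Exception:
            # Look up the fallback value by the decorated method's name.
            return ON_EXCEPTION_RETURN_VALUES.get(method.__name__)

    return wrapper


class DemoScraper:
    @suppress_exceptions
    def title(self):
        raise ValueError("title element missing")

    @suppress_exceptions
    def ingredients(self):
        raise KeyError("recipeIngredient")


scraper = DemoScraper()
fallback_title = scraper.title()
fallback_ingredients = scraper.ingredients()
```

Because this wrapper sits outermost, it sees exceptions raised by the method itself and by anything wrapped inside it, which matches the "outer-most plugin" description above.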
HTMLTagStripperPlugin
Bases: PluginInterface
Run the output from the methods listed through the stripper function defined above.
It is intended to strip away HTML tags present in the text.
Source code in recipe_scrapers/plugins/html_tags_stripper.py
NormalizeStringPlugin
Bases: PluginInterface
Explicitly run the output from the methods listed through normalize_string
Source code in recipe_scrapers/plugins/normalize_string.py
OpenGraphFillPlugin
Bases: PluginInterface
If any of the methods listed is invoked on a scraper class that does not implement it, attempt to return results by checking for OpenGraph metadata.
Source code in recipe_scrapers/plugins/opengraph_fill.py
OpenGraphImageFetchPlugin
Bases: PluginInterface
If the .image() method on any scraper raises an exception for some reason, try to fetch the recipe image from the og:image metadata on the page.
Applies to the .image() method on all scrapers if the plugin is active.
Source code in recipe_scrapers/plugins/opengraph_image_fetch.py
SchemaOrgFillPlugin
Bases: PluginInterface
If any of the methods listed is invoked on a scraper class that does not implement it, and Schema.org data is available, attempt to return the results from the available schema.
Source code in recipe_scrapers/plugins/schemaorg_fill.py
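A fill plugin of this kind boils down to "try the scraper's own method, and fall back to another data source if it is not implemented". The sketch below shows that pattern under stated assumptions: "not implemented" is modelled as NotImplementedError, the schema is a plain dictionary keyed by method name, and schema_fill, DemoScraper, and schema_data are all hypothetical names.

```python
import functools


def schema_fill(method):
    """Hypothetical decorator: fall back to schema data for missing methods."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        try:
            return method(self, *args, **kwargs)
        except NotImplementedError:
            # Assumption: schema data is a dict keyed by method name.
            schema_data = getattr(self, "schema_data", None)
            if schema_data and method.__name__ in schema_data:
                return schema_data[method.__name__]
            # Nothing to fall back on: let the original error propagate.
            raise

    return wrapper


class DemoScraper:
    def __init__(self, schema_data):
        self.schema_data = schema_data

    @schema_fill
    def title(self):
        raise NotImplementedError("This should be implemented.")

    @schema_fill
    def author(self):
        raise NotImplementedError("This should be implemented.")


demo = DemoScraper({"title": "Pancakes"})
filled_title = demo.title()
```

Calling demo.author() in this sketch still raises NotImplementedError, because the example schema has no entry for it.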
StaticValueExceptionHandlingPlugin
Bases: PluginInterface
Handles cases where a scraper indicates that it returns a static value -- perhaps because the website never provides info for that method at all (communicated by FieldNotProvidedByWebsiteException), or because for some reason it is easier or more convenient to define statically (communicated by StaticValueException).
Objects of StaticValueException and subclasses include a return value, so we return that to the caller instead after emitting a suitable warning for use by developers/users.
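The warn-then-return behaviour can be sketched as follows. The classes are local stand-ins, not imports from the library; the attribute name return_value, the warning category, and the names handle_static_values and DemoScraper are assumptions made for the example.

```python
import functools
import warnings


class StaticValueException(Exception):
    """Stand-in: carries the fixed value the scraper wants to return."""

    def __init__(self, return_value):
        super().__init__(f"static value: {return_value!r}")
        # Assumption: the value is exposed under this attribute name.
        self.return_value = return_value


class FieldNotProvidedByWebsiteException(StaticValueException):
    """Stand-in: the website never provides this field."""


def handle_static_values(method):
    """Hypothetical decorator: warn, then return the exception's value."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        try:
            return method(self, *args, **kwargs)
        except StaticValueException as exc:
            warnings.warn(
                f"{method.__name__}() returned a static value: {exc.return_value!r}",
                UserWarning,
                stacklevel=2,
            )
            return exc.return_value

    return wrapper


class DemoScraper:
    @handle_static_values
    def site_name(self):
        raise StaticValueException(return_value="Example Kitchen")

    @handle_static_values
    def ratings(self):
        raise FieldNotProvidedByWebsiteException(return_value=None)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    site_name = DemoScraper().site_name()
    ratings = DemoScraper().ratings()
```

Catching the base class is enough to cover FieldNotProvidedByWebsiteException as well, since it is documented above as a subclass of StaticValueException.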