Remote configuration
If you are using the base Extractor
class, you can automatically fetch configuration from the extraction pipelines API in CDF.
To use this, all you need to do is create a minimal configuration file like this:
type: remote
cognite:
# Read these from environment variables
host: ${COGNITE_BASE_URL}
project: ${COGNITE_PROJECT}
idp-authentication:
token-url: ${COGNITE_TOKEN_URL}
client-id: ${COGNITE_CLIENT_ID}
secret: ${COGNITE_CLIENT_SECRET}
scopes:
- ${COGNITE_BASE_URL}/.default
extraction-pipeline:
external-id: my-extraction-pipeline
containing only the cognite
section, and the type: remote
flag, and the configuration will be loaded dynamically from extraction pipelines.
You can, for example, use the upload config github action to publish configuration files from a github repository. Remote configuration files are combined with local configuration and loaded into the extractor on startup.
Detecting config changes
When using the base class, you have the option to automatically detect new
config revisions, and do one of several predefined actions (keep in mind that
this is not exclusive to remote configs, if the extractor is running with a
local configuration that changes, it will do the same action). You specify which
with an reload_config_action
enum. The enum can be one of the following values:
DO_NOTHING
which is the defaultREPLACE_ATTRIBUTE
which will replace theconfig
attribute on the object (keep in mind that if you are using therun_handle
instead of subclassing, this will have no effect). Be also aware that anything that is set up based on the config (upload queues, cognite client objects, loggers, connections to source, etc) will not change in this case.SHUTDOWN
will set thecancellation_token
event, and wait for the extractor to shut down. It is then intended that the service layer running the extractor (ie, windows services, systemd, docker, etc) will be configured to always restart the service if it shuts down. This is the recomended approach for reloading configs, as it is always guaranteed that everything will be re-initialized according to the new configuration.CALLBACK
is similar toREPLACE_ATTRIBUTE
with one difference. After replacing theconfig
attribute on the extractor object, it will call thereload_config_callback
method, which you will have to override in your subclass. This method should then do any necessary cleanup or re-initialization needed for your particular extractor.
To enable detection of config changes, set the reload_config_action
argument
to the Extractor
constructor to your chosen action:
# Using run handle:
with Extractor(
name="my_extractor",
description="Short description of my extractor",
config_class=MyConfig,
version="1.2.3",
run_handle=run_extractor,
reload_config_action=ReloadConfigAction.SHUTDOWN,
) as extractor:
extractor.run()
# Using subclass:
class MyExtractor(Extractor):
def __init__(self):
super().__init__(
name="my_extractor",
description="Short description of my extractor",
config_class=MyConfig,
version="1.2.3",
reload_config_action=ReloadConfigAction.SHUTDOWN,
)
The extractor will then periodically check if the config file has changed. The
default interval is 5 minutes, you can change this by setting the
reload_config_interval
attribute. As with any other interval in
extractor-utils, the unit is seconds.