Ensemble Tools Feature Detection: enstools.feature

This package is a module of the enstools Python package, developed within the framework of Waves to Weather - Transregional Collaborative Research Project (SFB/TRR165). enstools can be fetched from the public GitHub repo (https://github.com/wavestoweather/enstools).

enstools.feature is a modular framework for the identification and tracking of meteorological structures. It aims to provide easy-to-use interfaces for automatic parallelization and a unified, readable output. For the output, the framework uses protobuf to describe typed structures: users define a description of the structures to be detected and can then use the resulting objects for further statistical analyses.

Installation instructions

We recommend using a conda environment. The install instructions from the public enstools repo serve as a base and have been adapted and extended.

conda create --name enstools-feature python=3.7
conda activate enstools-feature

# install requirements listed in given venv_setup.sh
pip install --upgrade pip
pip install wheel numpy==1.20.0

# integrate enstools
pip install -e git+https://github.com/wavestoweather/enstools.git@main#egg=enstools

# install requirements for enstools-feature, and install enstools-feature in this environment
conda install --file requirements.txt
pip install -e .

Depending on the feature identification strategies used, additional packages may be required. // TODO

Usage: Applying existing techniques

The following example shows how to apply techniques that already exist in the code base to your data set. First, we need some imports, namely:

  • FeaturePipeline, which executes the identification pipeline

  • IdentificationTemplate, the identification technique; edit this accordingly

  • TrackingTemplate, the tracking technique; edit this accordingly

  • template_pb2, the protobuf Python file that is auto-generated at run time from the set description. Use the one that matches your detection strategy. The files are named *_pb2, where * is the name of the identification module. -> TODO: should not really need to set the template here, is specific to identification strategy!

    from enstools.feature.pipeline import FeaturePipeline
    from enstools.feature.identification.template import IdentificationTemplate
    from enstools.feature.tracking.template import TrackingTemplate
    from enstools.feature.identification._proto_gen import template_pb2

Then, we initialize the pipeline with the protobuf description and, optionally, the processing mode. For 3D data, the processing mode determines whether identification is performed individually on 2D (lat-lon) subsets or on 3D subsets.

pipeline = FeaturePipeline(template_pb2, processing_mode='2d')
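
To make the two modes concrete, the following sketch enumerates the subsets that would be handed to the identification step for a field with the dimensions (time, level, lat, lon). It does not use enstools: the function `iter_subsets` and the dimension names are assumptions made for illustration, and the framework's actual splitting may differ.

```python
from itertools import product

def iter_subsets(shape, processing_mode="2d"):
    """Yield one index dict per identification call.

    shape: dict mapping hypothetical dimension names to sizes,
           e.g. {"time": 2, "level": 3, "lat": 4, "lon": 5}.
    """
    if processing_mode == "2d":
        # One (lat, lon) slice per time step and per level.
        fixed = ("time", "level")
    elif processing_mode == "3d":
        # One (level, lat, lon) volume per time step.
        fixed = ("time",)
    else:
        raise ValueError(f"unknown processing mode: {processing_mode!r}")
    ranges = [range(shape[name]) for name in fixed]
    for index in product(*ranges):
        yield dict(zip(fixed, index))

shape = {"time": 2, "level": 3, "lat": 4, "lon": 5}
subsets_2d = list(iter_subsets(shape, "2d"))
subsets_3d = list(iter_subsets(shape, "3d"))
print(len(subsets_2d), len(subsets_3d))  # prints "6 2"
```

Under this reading, the '2d' mode produces more but smaller work items, while the '3d' mode keeps the vertical dimension together so that a technique can use it.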

Then, we initialize and set our strategies. The tracking can be set to None to be ignored.

i_strat = IdentificationTemplate(some_parameter='foo')
t_strat = TrackingTemplate()
pipeline.set_identification_strategy(i_strat)
pipeline.set_tracking_strategy(t_strat) # or None as argument if no tracking

Next, set the data to process.

pipeline.set_data_path(path)

Then, the pipeline can be executed, starting the identification and subsequently the tracking.

pipeline.execute()
# or separated...
# pipeline.execute_identification()
# pipeline.execute_tracking()

This generates an object description based on the set protobuf format. If tracking has been used, tracks can be generated based on a simple default heuristic; see the docstrings for further details. The object description holds the objects and, if tracking has been executed, a graph structure and the generated tracks.

pipeline.generate_tracks()
od = pipeline.get_object_desc()
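
As an illustration of how a graph of object links relates to tracks, the sketch below chains links between consecutive time steps into tracks. It is not the heuristic used by `generate_tracks()`: the function `build_tracks`, the object identifiers, and the `edges` list are made up for the example, and each object is assumed to have at most one predecessor and one successor.

```python
def build_tracks(nodes, edges):
    """Chain (parent, child) links into tracks.

    nodes: iterable of object identifiers, ordered by time step.
    edges: list of (parent, child) pairs linking objects of consecutive
           time steps. Assumes at most one successor and one predecessor
           per object (no splits or merges).
    """
    successor = dict(edges)
    has_parent = {child for _, child in edges}
    tracks = []
    for node in nodes:
        if node in has_parent:
            continue  # not the start of a track
        track = [node]
        while track[-1] in successor:
            track.append(successor[track[-1]])
        tracks.append(track)
    return tracks

# Hypothetical objects named "<time step>-<index>".
nodes = ["t0-a", "t0-b", "t1-a", "t1-b", "t2-a"]
edges = [("t0-a", "t1-a"), ("t1-a", "t2-a"), ("t0-b", "t1-b")]
print(build_tracks(nodes, edges))
# prints [['t0-a', 't1-a', 't2-a'], ['t0-b', 't1-b']]
```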

The output data set and description can be saved:

pipeline.save_result(description_type='json', description_path=..., dataset_path=...)
  • TODO what we provide, list different techniques...
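
Once a description has been saved with `description_type='json'`, it can be inspected with Python's standard `json` module. The sketch below only lists top-level keys, because the layout of the file depends on the protobuf description in use. It assumes the file contains a JSON object at the top level; the helper names, the file name, and the sample content are placeholders, not the framework's actual output.

```python
import json
import tempfile
from pathlib import Path

def load_description(path):
    """Read a saved JSON object description into a Python dict."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)

def top_level_keys(description):
    """Return the sorted top-level keys of a loaded description."""
    return sorted(description)

# Stand-in file so the example runs without enstools; in practice, pass
# the description_path that was given to pipeline.save_result().
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "description.json"
    sample_path.write_text(json.dumps({"placeholder_key": []}), encoding="utf-8")
    description = load_description(sample_path)

print(top_level_keys(description))  # prints ['placeholder_key']
```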

Usage: Adding techniques

We provide some template files, which we recommend as a starting point for your own identification strategy. If you want to add your own identification (and tracking) strategy to the framework, you need to:

  • Copy the template folder and rename it and its files accordingly. If you implement a tracking method that relies on pairwise comparison of objects from consecutive timesteps, you can use template_object_compare.

  • In the __init__.py, rename the class name to your identification strategy.

  • In the *.proto file, define the variables each of the detected objects should have. They follow the protobuf protocol, see here. The template file also provides a useful example. The .proto files are compiled automatically when the identification is run.

  • In the identification.py (tracking.py), implement your identification (tracking) strategy. See the template again for a useful example. There are a few methods:

      • __init__ is called from the run script, so the user can set parameters for the algorithm here.

      • precompute is called once for the entire data set. The data set can be altered here (temporally and spatially). If the strategy should return an additional field (DataArray), it should also be initialized here, as shown in the template.

      • identify contains your identification technique. This method is called in parallel and should return a list of objects. See the template and the docstrings for more information.

      • postprocess is called once for the entire data set after identification. The data set and the object description can be changed here.

  • TODO tracking
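
The call order of these methods can be sketched without the framework. The class and driver below are stand-ins: the method names follow the list above, but the signatures, the `run` driver, and the threshold technique are assumptions for illustration and do not reproduce the template's actual interface.

```python
from concurrent.futures import ThreadPoolExecutor

class ThresholdIdentification:
    """Toy strategy: report values above a threshold as objects."""

    def __init__(self, threshold):
        # Called from the run script; algorithm parameters are set here.
        self.threshold = threshold

    def precompute(self, dataset):
        # Called once for the whole data set; it may be altered here.
        return [list(subset) for subset in dataset]

    def identify(self, subset):
        # Called once per subset, possibly in parallel; returns a list of objects.
        return [
            {"index": i, "value": v}
            for i, v in enumerate(subset)
            if v > self.threshold
        ]

    def postprocess(self, dataset, objects_per_subset):
        # Called once after identification; here, empty results are dropped.
        return [objs for objs in objects_per_subset if objs]

def run(strategy, dataset):
    """Hypothetical driver mimicking the described call order."""
    dataset = strategy.precompute(dataset)
    with ThreadPoolExecutor() as pool:
        # pool.map keeps the results in the order of the input subsets.
        objects_per_subset = list(pool.map(strategy.identify, dataset))
    return strategy.postprocess(dataset, objects_per_subset)

# Each inner list stands for one subset (e.g. one time step).
result = run(ThresholdIdentification(threshold=5), [[1, 7, 3], [2, 2], [9, 6]])
print(result)
```

Because identify runs independently per subset, it should not rely on state shared between calls; anything that needs the whole data set belongs in precompute or postprocess.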

Acknowledgment and license

enstools.feature is a collaborative development within the Waves to Weather (SFB/TRR165) project and is funded by the German Research Foundation (DFG).

A full list of code contributors can be found in CONTRIBUTORS.md. TODO

The code is released under the Apache-2.0 license.