Contents
========

* `Contents <#contents>`__
* `EII TimeSeriesProfiler <#eii-timeseriesprofiler>`__

  * `Prerequisites <#prerequisites>`__
  * `EII TimeSeriesProfiler Mode <#eii-timeseriesprofiler-modes>`__
  * `EII TimeSeriesProfiler Configuration <#eii-timeseriesprofiler-configurations>`__
  * `Running TimeSeriesProfiler <#running-timeseriesprofiler>`__

EII TimeSeriesProfiler
----------------------

#. This module calculates the SPS (Samples Per Second) of any EII time-series module based on the stream published by that module.
#. This module calculates the average end-to-end time taken to process each sample of data, along with its breakdown. The end-to-end time is the time required for a metric to travel from mqtt-publisher to TimeSeriesProfiler (mqtt-publisher->telegraf->influx->kapacitor->influx->datastore->TimeSeriesProfiler).

Prerequisites
^^^^^^^^^^^^^

#. TimeSeriesProfiler expects a set of config, interfaces, and public and private keys to be present in ETCD as a prerequisite. To achieve this, ensure that an entry for TimeSeriesProfiler, with its path relative to the IEdgeInsights(\ ``[WORK_DIR]/IEdgeInsights/``\ ) directory, is set in the time-series.yml file present in the build/usecases(\ ``[WORK_DIR]/IEdgeInsights/build/usecases``\ ) directory. Following is an example:

   .. code-block:: yaml

      AppContexts:
      - ConfigMgrAgent
      - Visualizer/multimodal-data-visualization-streaming/eii/
      - Visualizer/multimodal-data-visualization/eii
      - DataStore
      - Kapacitor
      - Telegraf
      - tools/TimeSeriesProfiler

#. Once the entry is in place, run the following command:

   .. code-block:: sh

      python3 builder.py -f ./usecases/time-series.yml

EII TimeSeriesProfiler Mode
^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, the EII TimeSeriesProfiler supports two modes: "sps" and "monitor".

#. SPS mode

   This mode is enabled by setting the "mode" key in config(\ ``[WORK_DIR]/IEdgeInsights/tools/TimeSeriesProfiler/config.json``\ ) to "sps".
   This mode calculates the samples per second of any EII module by subscribing to that module's stream.

   .. code-block:: javascript

      "mode": "sps"

#. Monitor mode

   This mode is enabled by setting the "mode" key in config(\ ``[WORK_DIR]/IEdgeInsights/tools/TimeSeriesProfiler/config.json``\ ) to "monitor".

   .. code-block:: javascript

      "mode": "monitor"

   This mode calculates average and per-sample stats. Refer to the following example config, where TimeSeriesProfiler is used in monitor mode:

   .. code-block:: javascript

      "config": {
          "mode": "monitor",
          "monitor_mode_settings": {
              "display_metadata": false,
              "per_sample_stats": false,
              "avg_stats": true
          },
          "total_number_of_samples": 5,
          "export_to_csv": false
      }

   The stats displayed by the tool in monitor mode can be set in the monitor_mode_settings key of config.json(\ ``[WORK_DIR]/IEdgeInsights/tools/TimeSeriesProfiler/config.json``\ ).

   #. 'display_metadata': Displays the raw metadata with timestamps associated with every sample.
   #. 'per_sample_stats': Continuously displays the per-sample metrics of every sample.
   #. 'avg_stats': Continuously displays the metrics averaged over the samples.

.. note::

   * Running in profiling or monitor mode requires PROFILING_MODE to be set to **true** in .env(\ ``[WORK_DIR]/IEdgeInsights/build/.env``\ ) for the time-series containers.
   * For running TimeSeriesProfiler in SPS mode, it is recommended to keep PROFILING_MODE set to **false** in .env(\ ``[WORK_DIR]/IEdgeInsights/build/.env``\ ) for better performance.

EII TimeSeriesProfiler Configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#. total_number_of_samples

   If mode is set to 'sps', the average SPS is calculated for the number of samples set by this variable.
   If mode is set to 'monitor', the average stats are calculated for the number of samples set by this variable.
   Setting it to (-1) runs the profiler forever, until it is terminated by manually stopping the TimeSeriesProfiler container.
   total_number_of_samples should never be set to (-1) in 'sps' mode.

#. export_to_csv

   Setting this switch to **true** exports CSV files for the results obtained in TimeSeriesProfiler. In monitor mode, the runtime stats written to the CSV are chosen based on the following precedence: avg_stats, per_sample_stats, display_metadata.

Running TimeSeriesProfiler
^^^^^^^^^^^^^^^^^^^^^^^^^^

#. Prerequisite: The profiling UDF must return the "ts_kapacitor_udf_entry" and "ts_kapacitor_udf_exit" timestamps. The following are two examples:

   #. profiling_udf.go(\ ``[WORK_DIR]/IEdgeInsights/Kapacitor/udfs/profiling_udf.go``\ )
   #. rfc_classifier.py(\ ``[WORK_DIR]/IEdgeInsights/Kapacitor/udfs/rfc_classifier.py``\ )

   * **Additional:** Adding timestamps in ingestion and UDFs:

     To profile a user's own ingestion and UDFs, timestamps must be added to the ingestion and UDF modules, respectively. The TS Profiler needs three timestamps:

     #. The "ts" timestamp, which is filled in by the ingestor (done by the mqtt-publisher app).
     #. The "ts_kapacitor_udf_entry" and "ts_kapacitor_udf_exit" timestamps, which the UDF must provide so that the UDF execution time can be profiled:

        * ts_kapacitor_udf_entry: timestamp in the UDF before execution of the algorithm
        * ts_kapacitor_udf_exit: timestamp in the UDF after execution of the algorithm

     For sample profiling UDFs, refer to profiling_udf.go(\ ``[WORK_DIR]/IEdgeInsights/Kapacitor/udfs/profiling_udf.go``\ ) and rfc_classifier.py(\ ``[WORK_DIR]/IEdgeInsights/Kapacitor/udfs/rfc_classifier.py``\ ).

   * The configuration required to run profiling_udf.go as a profiling UDF is as follows:

     In **Kapacitor config.json**\ (\ ``[WORK_DIR]/IEdgeInsights/Kapacitor/config.json``\ ), update the "task" key as follows:

     .. code-block:: javascript

        "task": [{
            "tick_script": "profiling_udf.tick",
            "task_name": "profiling_udf",
            "udfs": [{
                "type": "go",
                "name": "profiling_udf"
            }]
        }]

     In **kapacitor.conf**\ (\ ``[WORK_DIR]/IEdgeInsights/Kapacitor/config/kapacitor.conf``\ ), update the udf section:

     .. code-block:: toml

        [udf.functions]
          [udf.functions.profiling_udf]
            socket = "/tmp/profiling_udf"
            timeout = "20s"

   * The configuration required to run rfc_classifier.py as a profiling UDF is as follows:

     In **Kapacitor config.json**\ (\ ``[WORK_DIR]/IEdgeInsights/Kapacitor/config.json``\ ), update the "task" key as follows:

     .. code-block:: javascript

        "task": [{
            "tick_script": "rfc_task.tick",
            "task_name": "random_forest_sample"
        }]

     In **kapacitor.conf**\ (\ ``[WORK_DIR]/IEdgeInsights/Kapacitor/config/kapacitor.conf``\ ), update the udf section:

     .. code-block:: toml

        [udf.functions.rfc]
          prog = "python3.7"
          args = ["-u", "/EII/udfs/rfc_classifier.py"]
          timeout = "60s"
          [udf.functions.rfc.env]
            PYTHONPATH = "/EII/go/src/github.com/influxdata/kapacitor/udf/agent/py/"

     Keep the config.json(\ ``[WORK_DIR]/IEdgeInsights/tools/TimeSeriesProfiler/config.json``\ ) file as follows:

     .. code-block:: javascript

        {
            "config": {
                "total_number_of_samples": 10,
                "export_to_csv": false
            },
            "interfaces": {
                "Subscribers": [
                    {
                        "Name": "default",
                        "Type": "zmq_tcp",
                        "EndPoint": "ia_datastore:65032",
                        "PublisherAppName": "DataStore",
                        "Topics": [
                            "rfc_results"
                        ]
                    }
                ]
            }
        }

     In .env(\ ``[WORK_DIR]/IEdgeInsights/build/.env``\ ), set PROFILING_MODE to true.

#. Set environment variables accordingly in config.json(\ ``[WORK_DIR]/IEdgeInsights/tools/TimeSeriesProfiler/config.json``\ ).
#. Set the required output stream or streams and the appropriate stream config in the config.json(\ ``[WORK_DIR]/IEdgeInsights/tools/TimeSeriesProfiler/config.json``\ ) file.
#. To run this tool in IPC mode, make the following changes to the subscribers interface section of config.json(\ ``[WORK_DIR]/IEdgeInsights/tools/TimeSeriesProfiler/config.json``\ ):

   .. code-block:: javascript

      {
          "Type": "zmq_ipc",
          "EndPoint": "/EII/sockets"
      }

#. To provision, build, and run the tool along with the EII time-series recipe or stack, see README.md.
#. Run the following command to see the logs:

   .. code-block:: sh

      docker logs -f ia_timeseries_profiler
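
The configuration rules from the EII TimeSeriesProfiler Configuration section, and the UDF timestamps from the prerequisite above, can be sketched in a few lines of Python. This is an illustrative sketch only and is not part of the tool: the function names ``validate_profiler_config``, ``csv_stat_for``, and ``udf_execution_time`` are hypothetical, the sketch assumes the "mode" key is always present in the "config" object, and it assumes the timestamps are numeric values expressed in a common unit.

```python
from typing import Any, Dict, Optional

# The two modes named in this document.
VALID_MODES = ("sps", "monitor")

# Precedence stated in this document for the stats written to CSV in monitor mode.
CSV_STAT_PRECEDENCE = ("avg_stats", "per_sample_stats", "display_metadata")


def validate_profiler_config(config: Dict[str, Any]) -> None:
    """Raise ValueError if the "config" object breaks a rule stated above.

    Assumption: "mode" is always present; the document does not say
    what the tool does when the key is missing.
    """
    mode = config.get("mode")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # -1 means "run forever", which the document forbids in sps mode.
    if mode == "sps" and config.get("total_number_of_samples") == -1:
        raise ValueError("total_number_of_samples must not be -1 in 'sps' mode")


def csv_stat_for(config: Dict[str, Any]) -> Optional[str]:
    """Return the monitor-mode stat that takes precedence for CSV export.

    Returns None when CSV export is off, the mode is not "monitor",
    or no stat is enabled in monitor_mode_settings.
    """
    if config.get("mode") != "monitor" or config.get("export_to_csv") is not True:
        return None
    settings = config.get("monitor_mode_settings", {})
    for name in CSV_STAT_PRECEDENCE:
        if settings.get(name):
            return name
    return None


def udf_execution_time(sample: Dict[str, float]) -> float:
    """Return UDF exit minus UDF entry, in the unit of the timestamps."""
    return sample["ts_kapacitor_udf_exit"] - sample["ts_kapacitor_udf_entry"]


if __name__ == "__main__":
    example = {
        "mode": "monitor",
        "monitor_mode_settings": {
            "display_metadata": False,
            "per_sample_stats": False,
            "avg_stats": True,
        },
        "total_number_of_samples": 5,
        "export_to_csv": True,
    }
    validate_profiler_config(example)
    print(csv_stat_for(example))
```

The checks are kept separate from the precedence lookup so that each rule can be read against the sentence in the document that states it. Anything the document does not specify, such as the timestamp unit, is left to the caller.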