Description
I can get the Triton server up with the log parsing model:
=============================
== Triton Inference Server ==
=============================
NVIDIA Release 23.06 (build 62878575)
Triton Server Version 2.35.0
Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
I0916 14:20:22.782334 1 libtorch.cc:2253] TRITONBACKEND_Initialize: pytorch
I0916 14:20:22.782371 1 libtorch.cc:2263] Triton TRITONBACKEND API version: 1.13
I0916 14:20:22.782378 1 libtorch.cc:2269] 'pytorch' TRITONBACKEND API version: 1.13
I0916 14:20:22.921466 1 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f796c000000' with size 268435456
I0916 14:20:22.921727 1 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0916 14:20:22.924998 1 model_lifecycle.cc:462] loading: log-parsing-onnx:1
I0916 14:20:22.933856 1 onnxruntime.cc:2530] TRITONBACKEND_Initialize: onnxruntime
I0916 14:20:22.933869 1 onnxruntime.cc:2540] Triton TRITONBACKEND API version: 1.13
I0916 14:20:22.933874 1 onnxruntime.cc:2546] 'onnxruntime' TRITONBACKEND API version: 1.13
I0916 14:20:22.933879 1 onnxruntime.cc:2576] backend configuration:
{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I0916 14:20:22.946376 1 onnxruntime.cc:2641] TRITONBACKEND_ModelInitialize: log-parsing-onnx (version 1)
I0916 14:20:22.947872 1 onnxruntime.cc:692] skipping model configuration auto-complete for 'log-parsing-onnx': inputs and outputs already specified
I0916 14:20:22.948285 1 onnxruntime.cc:2702] TRITONBACKEND_ModelInstanceInitialize: log-parsing-onnx (GPU device 0)
2023-09-16 14:20:23.802630183 [W:onnxruntime:, session_state.cc:1169 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-09-16 14:20:23.802647784 [W:onnxruntime:, session_state.cc:1171 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
I0916 14:20:24.035642 1 model_lifecycle.cc:815] successfully loaded 'log-parsing-onnx'
I0916 14:20:24.035741 1 server.cc:603]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0916 14:20:24.035938 1 server.cc:630]
+-------------+-------------------------------+-------------------------------+
| Backend | Path | Config |
+-------------+-------------------------------+-------------------------------+
| pytorch | /opt/tritonserver/backends/py | {} |
| | torch/libtriton_pytorch.so | |
| onnxruntime | /opt/tritonserver/backends/on | {"cmdline":{"auto-complete-co |
| | nxruntime/libtriton_onnxrunti | nfig":"true","backend-directo |
| | me.so | ry":"/opt/tritonserver/backen |
| | | ds","min-compute-capability": |
| | | "6.000000","default-max-batch |
| | | -size":"4"}} |
| | | |
| | | |
+-------------+-------------------------------+-------------------------------+
I0916 14:20:24.035975 1 server.cc:673]
+------------------+---------+--------+
| Model | Version | Status |
+------------------+---------+--------+
| log-parsing-onnx | 1 | READY |
+------------------+---------+--------+
I0916 14:20:24.062625 1 metrics.cc:808] Collecting metrics for GPU 0: NVIDIA GeForce GTX 1650
I0916 14:20:24.062952 1 metrics.cc:701] Collecting CPU metrics
I0916 14:20:24.063228 1 tritonserver.cc:2385]
+----------------------------------+------------------------------------------+
| Option | Value |
+----------------------------------+------------------------------------------+
| server_id | triton |
| server_version | 2.35.0 |
| server_extensions | classification sequence model_repository |
| | model_repository(unload_dependents) sch |
| | edule_policy model_configuration system_ |
| | shared_memory cuda_shared_memory binary_ |
| | tensor_data parameters statistics trace |
| | logging |
| model_repository_path[0] | /models/triton-model-repo |
| model_control_mode | MODE_EXPLICIT |
| startup_models_0 | log-parsing-onnx |
| strict_model_config | 0 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
| cache_enabled | 0 |
+----------------------------------+------------------------------------------+
I0916 14:20:24.069223 1 grpc_server.cc:2445] Started GRPCInferenceService at 0.0.0.0:8001
I0916 14:20:24.069385 1 http_server.cc:3555] Started HTTPService at 0.0.0.0:8000
I0916 14:20:24.111323 1 http_server.cc:185] Started Metrics Service at 0.0.0.0:8002
W0916 14:20:25.064985 1 metrics.cc:573] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0916 14:20:26.065373 1 metrics.cc:573] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0916 14:20:27.067833 1 metrics.cc:573] Unable to get power limit for GPU 0. Status:Success, value:0.000000
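For reference, I started the container roughly like this. I've reconstructed the flags from the option table in the log above (model repository path, explicit model control mode, startup model), so treat the exact invocation as approximate rather than a copy-paste of what I ran:

docker run --rm --gpus=all \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v ${MORPHEUS_ROOT}/models:/models \
  nvcr.io/nvidia/tritonserver:23.06-py3 \
  tritonserver --model-repository=/models/triton-model-repo \
    --model-control-mode=explicit \
    --load-model=log-parsing-onnx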
So I want to run the log-parsing pipeline:
python run.py \
--num_threads 1 \
--input_file ${MORPHEUS_ROOT}/models/datasets/validation-data/log-parsing-validation-data-input.csv \
--output_file ./log-parsing-output.jsonlines \
--model_vocab_hash_file=${MORPHEUS_ROOT}/morpheus/data/bert-base-cased-hash.txt \
--model_vocab_file=${MORPHEUS_ROOT}/models/training-tuning-scripts/sid-models/resources/bert-base-cased-vocab.txt \
--model_seq_length=256 \
--model_name log-parsing-onnx \
--model_config_file=${MORPHEUS_ROOT}/models/log-parsing-models/log-parsing-config-20220418.json \
--server_url localhost:8001
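Before kicking off the pipeline, a quick way to confirm the server is actually answering is Triton's standard v2 readiness endpoint over HTTP (a 200 here means ready; the pipeline itself uses the gRPC port 8001):

curl -s -o /dev/null -w '%{http_code}\n' localhost:8000/v2/health/ready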
But there is no requirements.txt or conda .yml file for creating an environment inside the dev container that VS Code launches for the Morpheus repo. Can you make one? I've put my own rough guess just below.
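To be explicit: the channel list, package names, and versions in this sketch are all my assumptions, pieced together from the traceback further down (python 3.10, tritonclient over gRPC, the grpcio leak warning) and the example's imports; none of it comes from the Morpheus repo.

# morpheus-log-parsing-guess.yml (my own sketch, not from the repo)
name: morpheus-log-parsing
channels:
  - rapidsai
  - nvidia
  - conda-forge
dependencies:
  - python=3.10            # my traceback shows python3.10
  - mrc                    # assumption: NVIDIA's MRC python bindings ship under this name
  - grpcio>=1.51.1         # the tritonclient warning below recommends >=1.51.1
  - pip
  - pip:
      - tritonclient[grpc] # the pipeline talks to Triton at localhost:8001 over gRPC

Anyway, after an error popped up telling me I need MRC, I went through and found the MRC repo and installed it into its own env, but now I am stuck on: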
(mrc2) coder ➜ /workspaces/Morpheus/examples/log_parsing $ python run.py \
--num_threads 1 \
--input_file ${MORPHEUS_ROOT}/models/datasets/validation-data/log-parsing-validation-data-input.csv \
--output_file ./log-parsing-output.jsonlines \
--model_vocab_hash_file=${MORPHEUS_ROOT}/morpheus/data/bert-base-cased-hash.txt \
--model_vocab_file=${MORPHEUS_ROOT}/models/training-tuning-scripts/sid-models/resources/bert-base-cased-vocab.txt \
--model_seq_length=256 \
--model_name log-parsing-onnx \
--model_config_file=${MORPHEUS_ROOT}/models/log-parsing-models/log-parsing-config-20220418.json \
--server_url localhost:8001
/home/coder/.conda/envs/mrc2/lib/python3.10/site-packages/tritonclient/grpc/__init__.py:54: UserWarning: Imported version of grpc is 1.46.4. There is a memory leak in certain Python GRPC versions (1.43.0 to be specific). Please use versions <1.43.0 or >=1.51.1 to avoid leaks (see https://github.com/grpc/grpc/issues/28513).
warnings.warn(
Traceback (most recent call last):
File "/workspaces/Morpheus/examples/log_parsing/run.py", line 18, in <module>
from inference import LogParsingInferenceStage
File "/workspaces/Morpheus/examples/log_parsing/inference.py", line 23, in <module>
from mrc.core import operators as ops
ModuleNotFoundError: No module named 'mrc.core'
(mrc2) coder ➜ /workspaces/Morpheus/examples/log_parsing $
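Note what the traceback says: plain `import mrc` evidently succeeds (otherwise the error would complain about `mrc`, not `mrc.core`), so my guess is that the `mrc` package in this env is not NVIDIA's MRC at all but some unrelated package that shares the import name. This is just standard Python introspection to see what actually got imported, nothing Morpheus-specific:

python -c "import mrc; print(mrc.__file__); print([n for n in dir(mrc) if not n.startswith('_')])"

If the printed path doesn't point into a real MRC install, that would explain the missing mrc.core.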
This is very disappointing, especially after hitting similar problems with the RAPIDS examples. I'm not sure why more care isn't put into preparing these examples so beginners can get their feet wet.
I posted two GitHub issues, but recalling how the RAPIDS issue was handled (the comments felt dismissive), I'm hoping for a bit more of a hand here.
Anyway, it would be nice if you could make newcomers to NVIDIA feel a bit more welcome with your examples.
At the very least, if you can't fix this, can you outline a course of study I could follow to learn enough to contribute to your repo?