2019-11-03 11:45:01 Starting - Starting the training job...
2019-11-03 11:45:03 Starting - Launching requested ML instances......
2019-11-03 11:46:02 Starting - Preparing the instances for training...
2019-11-03 11:46:43 Downloading - Downloading input data...
2019-11-03 11:47:26 Training - Training image download completed. Training in progress..Docker entrypoint called with argument(s): train
[11/03/2019 11:47:28 INFO 140552810366784] Reading default configuration from /opt/amazon/lib/python2.7/site-packages/algorithm/resources/default-input.json: {u'_enable_profiler': u'false', u'_tuning_objective_metric': u'', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'_kvstore': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'true', u'epochs': u'1', u'init_method': u'random', u'local_lloyd_tol': u'0.0001', u'local_lloyd_max_iter': u'300', u'_disable_wait_to_read': u'false', u'extra_center_factor': u'auto', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'half_life_time_size': u'0', u'_num_slices': u'1'}
[11/03/2019 11:47:28 INFO 140552810366784] Reading provided configuration from /opt/ml/input/config/hyperparameters.json: {u'epochs': u'100', u'feature_dim': u'784', u'k': u'10', u'force_dense': u'True'}
[11/03/2019 11:47:28 INFO 140552810366784] Final configuration: {u'_tuning_objective_metric': u'', u'extra_center_factor': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'True', u'epochs': u'100', u'feature_dim': u'784', u'local_lloyd_tol': u'0.0001', u'_disable_wait_to_read': u'false', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'_enable_profiler': u'false', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'init_method': u'random', u'half_life_time_size': u'0', u'local_lloyd_max_iter': u'300', u'_kvstore': u'auto', u'k': u'10', u'_num_slices': u'1'}
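The final configuration above amounts to mini-batch k-means: k=10 clusters over 784-dimensional vectors (MNIST-sized images), mini_batch_size=5000, 100 epochs, with a local Lloyd refinement pass at the end. As an illustrative sketch of the core update — not the AWS implementation — a single mini-batch step with per-centroid learning rates (in the style of Sculley's web-scale k-means) can be written as:

```python
import random

def minibatch_kmeans_step(centers, counts, batch):
    """One mini-batch k-means update (illustrative sketch only).

    centers: list of k centroids (each a list of floats)
    counts:  per-centroid assignment counts, used for the learning rate
    batch:   list of points sampled from the data set
    """
    for x in batch:
        # Assign x to its nearest centroid (squared Euclidean distance).
        j = min(range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(centers[i], x)))
        counts[j] += 1
        eta = 1.0 / counts[j]  # learning rate decays as the centroid sees more points
        centers[j] = [c + eta * (a - c) for c, a in zip(centers[j], x)]
    return centers, counts

# Toy run: two well-separated 1-D clusters, k=2.
random.seed(0)
data = ([[random.gauss(0.0, 0.1)] for _ in range(100)] +
        [[random.gauss(5.0, 0.1)] for _ in range(100)])
centers, counts = [[0.5], [4.5]], [1, 1]
for _ in range(20):  # 20 mini-batches of 32 points each
    centers, counts = minibatch_kmeans_step(centers, counts, random.sample(data, 32))
```

After a few mini-batches the centroids settle near the true cluster means (0 and 5). The production algorithm additionally over-provisions centers (`extra_center_factor` evaluates to 10 here, i.e. 100 candidate centers) and reduces them to k=10 with local Lloyd iterations at the end of training.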
[11/03/2019 11:47:28 WARNING 140552810366784] Loggers have already been setup.
[11/03/2019 11:47:28 INFO 140552810366784] Environment: {'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/dc163b99-1521-4ccb-ad30-92ce3ffc3cce', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'DMLC_PS_ROOT_PORT': '9000', 'DMLC_NUM_WORKER': '2', 'SAGEMAKER_HTTP_PORT': '8080', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'DMLC_PS_ROOT_URI': '10.0.229.182', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'HOME': '/root', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-208-60.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/4055df43-a805-42f4-8085-40b6d8b6ab74', 'DMLC_ROLE': 'worker', 'PWD': '/', 'DMLC_NUM_SERVER': '1', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
Process 1 is a worker.
[11/03/2019 11:47:28 INFO 140552810366784] Using default worker.
[11/03/2019 11:47:28 INFO 140552810366784] Loaded iterator creator application/x-recordio-protobuf for content type ('application/x-recordio-protobuf', '1.0')
[11/03/2019 11:47:28 INFO 140552810366784] Create Store: dist_async
Docker entrypoint called with argument(s): train
[11/03/2019 11:47:29 INFO 140169171593024] Reading default configuration from /opt/amazon/lib/python2.7/site-packages/algorithm/resources/default-input.json: {u'_enable_profiler': u'false', u'_tuning_objective_metric': u'', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'_kvstore': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'true', u'epochs': u'1', u'init_method': u'random', u'local_lloyd_tol': u'0.0001', u'local_lloyd_max_iter': u'300', u'_disable_wait_to_read': u'false', u'extra_center_factor': u'auto', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'half_life_time_size': u'0', u'_num_slices': u'1'}
[11/03/2019 11:47:29 INFO 140169171593024] Reading provided configuration from /opt/ml/input/config/hyperparameters.json: {u'epochs': u'100', u'feature_dim': u'784', u'k': u'10', u'force_dense': u'True'}
[11/03/2019 11:47:29 INFO 140169171593024] Final configuration: {u'_tuning_objective_metric': u'', u'extra_center_factor': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'True', u'epochs': u'100', u'feature_dim': u'784', u'local_lloyd_tol': u'0.0001', u'_disable_wait_to_read': u'false', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'_enable_profiler': u'false', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'init_method': u'random', u'half_life_time_size': u'0', u'local_lloyd_max_iter': u'300', u'_kvstore': u'auto', u'k': u'10', u'_num_slices': u'1'}
[11/03/2019 11:47:29 WARNING 140169171593024] Loggers have already been setup.
[11/03/2019 11:47:29 INFO 140169171593024] Launching parameter server for role scheduler
[11/03/2019 11:47:29 INFO 140169171593024] {'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'SAGEMAKER_HTTP_PORT': '8080', 'HOME': '/root', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'PWD': '/', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
[11/03/2019 11:47:29 INFO 140169171593024] envs={'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'DMLC_NUM_WORKER': '2', 'DMLC_PS_ROOT_PORT': '9000', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'SAGEMAKER_HTTP_PORT': '8080', 'HOME': '/root', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'DMLC_PS_ROOT_URI': '10.0.229.182', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'DMLC_ROLE': 'scheduler', 'PWD': '/', 'DMLC_NUM_SERVER': '1', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
[11/03/2019 11:47:29 INFO 140169171593024] Launching parameter server for role server
[11/03/2019 11:47:29 INFO 140169171593024] {'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'SAGEMAKER_HTTP_PORT': '8080', 'HOME': '/root', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'PWD': '/', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
[11/03/2019 11:47:29 INFO 140169171593024] envs={'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'DMLC_NUM_WORKER': '2', 'DMLC_PS_ROOT_PORT': '9000', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'SAGEMAKER_HTTP_PORT': '8080', 'HOME': '/root', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'DMLC_PS_ROOT_URI': '10.0.229.182', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'DMLC_ROLE': 'server', 'PWD': '/', 'DMLC_NUM_SERVER': '1', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
[11/03/2019 11:47:29 INFO 140169171593024] Environment: {'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'DMLC_PS_ROOT_PORT': '9000', 'DMLC_NUM_WORKER': '2', 'SAGEMAKER_HTTP_PORT': '8080', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'DMLC_PS_ROOT_URI': '10.0.229.182', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'HOME': '/root', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'DMLC_ROLE': 'worker', 'PWD': '/', 'DMLC_NUM_SERVER': '1', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
Process 109 is a shell:scheduler.
Process 118 is a shell:server.
Process 1 is a worker.
[11/03/2019 11:47:29 INFO 140169171593024] Using default worker.
[11/03/2019 11:47:29 INFO 140169171593024] Loaded iterator creator application/x-recordio-protobuf for content type ('application/x-recordio-protobuf', '1.0')
[11/03/2019 11:47:29 INFO 140169171593024] Create Store: dist_async
[11/03/2019 11:47:30 INFO 140552810366784] nvidia-smi took: 0.0252320766449 secs to identify 0 gpus
[11/03/2019 11:47:30 INFO 140552810366784] Number of GPUs being used: 0
[11/03/2019 11:47:30 INFO 140552810366784] Setting up with params: {u'_tuning_objective_metric': u'', u'extra_center_factor': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'True', u'epochs': u'100', u'feature_dim': u'784', u'local_lloyd_tol': u'0.0001', u'_disable_wait_to_read': u'false', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'_enable_profiler': u'false', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'init_method': u'random', u'half_life_time_size': u'0', u'local_lloyd_max_iter': u'300', u'_kvstore': u'auto', u'k': u'10', u'_num_slices': u'1'}
[11/03/2019 11:47:30 INFO 140552810366784] 'extra_center_factor' was set to 'auto', evaluated to 10.
[11/03/2019 11:47:30 INFO 140552810366784] Number of GPUs being used: 0
[11/03/2019 11:47:30 INFO 140552810366784] number of center slices 1
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Number of Batches Since Last Reset": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Number of Records Since Last Reset": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Total Batches Seen": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Total Records Seen": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Max Records Seen Between Resets": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Reset Count": {"count": 1, "max": 0, "sum": 0.0, "min": 0}}, "EndTime": 1572781650.394244, "Dimensions": {"Host": "algo-2", "Meta": "init_train_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781650.394209}
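Lines prefixed with `#metrics` carry a JSON payload (per-metric count/max/sum/min, plus `Dimensions` and start/end timestamps), so they can be scraped programmatically from the log stream. A minimal parser, assuming only the `#metrics ` prefix convention visible in these lines:

```python
import json

def parse_metrics(line, prefix="#metrics "):
    """Return the JSON payload of a '#metrics' log line, or None for other lines."""
    if not line.startswith(prefix):
        return None
    return json.loads(line[len(prefix):])

# A trimmed-down example in the same shape as the log line above.
line = ('#metrics {"Metrics": {"Total Records Seen": '
        '{"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}}, '
        '"EndTime": 1572781650.394244, '
        '"Dimensions": {"Host": "algo-2", "Operation": "training"}, '
        '"StartTime": 1572781650.394209}')

payload = parse_metrics(line)
total_records = payload["Metrics"]["Total Records Seen"]["sum"]
host = payload["Dimensions"]["Host"]
```

Filtering a CloudWatch log stream through such a parser is one way to reconstruct per-host progress without relying on the human-readable `#progress_metric` lines.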
[2019-11-03 11:47:30.417] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 0, "duration": 87, "num_examples": 1, "num_bytes": 15820000}
[2019-11-03 11:47:30.596] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 1, "duration": 178, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140552810366784] #progress_metric: host=algo-2, completed 1 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 6, "sum": 6.0, "min": 6}, "Total Records Seen": {"count": 1, "max": 30000, "sum": 30000.0, "min": 30000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 1, "sum": 1.0, "min": 1}}, "EndTime": 1572781650.596894, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 0}, "StartTime": 1572781650.417194}
[11/03/2019 11:47:30 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=139020.312597 records/second
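The reported throughput is consistent with records processed divided by the iterator wall time from the preceding `#metrics` line (`EndTime` minus `StartTime`). A quick sanity check, assuming that relationship holds:

```python
# Values copied from the algo-2 epoch-0 #metrics line above.
records = 25000
start, end = 1572781650.417194, 1572781650.596894

throughput = records / (end - start)  # ~139k records/second
# The log reports 139020.3; the sub-percent gap presumably comes from the
# throughput timer being started/stopped slightly apart from the data
# iterator's own timestamps.
```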
[11/03/2019 11:47:30 INFO 140169171593024] nvidia-smi took: 0.025279045105 secs to identify 0 gpus
[11/03/2019 11:47:30 INFO 140169171593024] Number of GPUs being used: 0
[11/03/2019 11:47:30 INFO 140169171593024] Setting up with params: {u'_tuning_objective_metric': u'', u'extra_center_factor': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'True', u'epochs': u'100', u'feature_dim': u'784', u'local_lloyd_tol': u'0.0001', u'_disable_wait_to_read': u'false', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'_enable_profiler': u'false', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'init_method': u'random', u'half_life_time_size': u'0', u'local_lloyd_max_iter': u'300', u'_kvstore': u'auto', u'k': u'10', u'_num_slices': u'1'}
[11/03/2019 11:47:30 INFO 140169171593024] 'extra_center_factor' was set to 'auto', evaluated to 10.
[11/03/2019 11:47:30 INFO 140169171593024] Number of GPUs being used: 0
[11/03/2019 11:47:30 INFO 140169171593024] number of center slices 1
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Number of Batches Since Last Reset": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Number of Records Since Last Reset": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Total Batches Seen": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Total Records Seen": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Max Records Seen Between Resets": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Reset Count": {"count": 1, "max": 0, "sum": 0.0, "min": 0}}, "EndTime": 1572781650.390149, "Dimensions": {"Host": "algo-1", "Meta": "init_train_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781650.390114}
[2019-11-03 11:47:30.413] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 0, "duration": 88, "num_examples": 1, "num_bytes": 15820000}
[2019-11-03 11:47:30.610] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 1, "duration": 196, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140169171593024] #progress_metric: host=algo-1, completed 1 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 6, "sum": 6.0, "min": 6}, "Total Records Seen": {"count": 1, "max": 30000, "sum": 30000.0, "min": 30000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 1, "sum": 1.0, "min": 1}}, "EndTime": 1572781650.611, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 0}, "StartTime": 1572781650.413488}
[11/03/2019 11:47:30 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=126474.646596 records/second
[2019-11-03 11:47:30.732] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 3, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140169171593024] #progress_metric: host=algo-1, completed 2 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 11, "sum": 11.0, "min": 11}, "Total Records Seen": {"count": 1, "max": 55000, "sum": 55000.0, "min": 55000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 2, "sum": 2.0, "min": 2}}, "EndTime": 1572781650.732486, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 1}, "StartTime": 1572781650.611256}
[11/03/2019 11:47:30 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=206017.191414 records/second
[2019-11-03 11:47:30.853] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 5, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140169171593024] #progress_metric: host=algo-1, completed 3 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 16, "sum": 16.0, "min": 16}, "Total Records Seen": {"count": 1, "max": 80000, "sum": 80000.0, "min": 80000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 3, "sum": 3.0, "min": 3}}, "EndTime": 1572781650.854186, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 2}, "StartTime": 1572781650.732736}
[11/03/2019 11:47:30 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=205620.877095 records/second
[2019-11-03 11:47:30.962] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 7, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140169171593024] #progress_metric: host=algo-1, completed 4 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 21, "sum": 21.0, "min": 21}, "Total Records Seen": {"count": 1, "max": 105000, "sum": 105000.0, "min": 105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 4, "sum": 4.0, "min": 4}}, "EndTime": 1572781650.96329, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 3}, "StartTime": 1572781650.856089}
[11/03/2019 11:47:30 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=232952.697479 records/second
[2019-11-03 11:47:31.061] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 9, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 5 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 26, "sum": 26.0, "min": 26}, "Total Records Seen": {"count": 1, "max": 130000, "sum": 130000.0, "min": 130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 5, "sum": 5.0, "min": 5}}, "EndTime": 1572781651.061609, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 4}, "StartTime": 1572781650.963495}
[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=254481.560222 records/second
[2019-11-03 11:47:31.176] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 11, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 6 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 31, "sum": 31.0, "min": 31}, "Total Records Seen": {"count": 1, "max": 155000, "sum": 155000.0, "min": 155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 6, "sum": 6.0, "min": 6}}, "EndTime": 1572781651.177087, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 5}, "StartTime": 1572781651.061859}
[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216692.257301 records/second
[2019-11-03 11:47:30.711] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 3, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140552810366784] #progress_metric: host=algo-2, completed 2 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 11, "sum": 11.0, "min": 11}, "Total Records Seen": {"count": 1, "max": 55000, "sum": 55000.0, "min": 55000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 2, "sum": 2.0, "min": 2}}, "EndTime": 1572781650.712005, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 1}, "StartTime": 1572781650.597101}
[11/03/2019 11:47:30 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=217345.775486 records/second
[2019-11-03 11:47:30.825] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 5, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140552810366784] #progress_metric: host=algo-2, completed 3 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 16, "sum": 16.0, "min": 16}, "Total Records Seen": {"count": 1, "max": 80000, "sum": 80000.0, "min": 80000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 3, "sum": 3.0, "min": 3}}, "EndTime": 1572781650.826047, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 2}, "StartTime": 1572781650.712255}
[11/03/2019 11:47:30 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=219428.877551 records/second
[2019-11-03 11:47:30.942] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 7, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140552810366784] #progress_metric: host=algo-2, completed 4 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 21, "sum": 21.0, "min": 21}, "Total Records Seen": {"count": 1, "max": 105000, "sum": 105000.0, "min": 105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 4, "sum": 4.0, "min": 4}}, "EndTime": 1572781650.943047, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 3}, "StartTime": 1572781650.826549}
[11/03/2019 11:47:30 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=214336.728541 records/second
[2019-11-03 11:47:31.046] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 9, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 5 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 26, "sum": 26.0, "min": 26}, "Total Records Seen": {"count": 1, "max": 130000, "sum": 130000.0, "min": 130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 5, "sum": 5.0, "min": 5}}, "EndTime": 1572781651.046523, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 4}, "StartTime": 1572781650.943299}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241870.421288 records/second
[2019-11-03 11:47:31.143] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 11, "duration": 96, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 6 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 31, "sum": 31.0, "min": 31}, "Total Records Seen": {"count": 1, "max": 155000, "sum": 155000.0, "min": 155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 6, "sum": 6.0, "min": 6}}, "EndTime": 1572781651.144019, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 5}, "StartTime": 1572781651.046998}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=257288.957374 records/second
[2019-11-03 11:47:31.244] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 13, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 7 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 36, "sum": 36.0, "min": 36}, "Total Records Seen": {"count": 1, "max": 180000, "sum": 180000.0, "min": 180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 7, "sum": 7.0, "min": 7}}, "EndTime": 1572781651.244924, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 6}, "StartTime": 1572781651.144272}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=248028.100718 records/second
[2019-11-03 11:47:31.344] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 15, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 8 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 41, "sum": 41.0, "min": 41}, "Total Records Seen": {"count": 1, "max": 205000, "sum": 205000.0, "min": 205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 8, "sum": 8.0, "min": 8}}, "EndTime": 1572781651.345334, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 7}, "StartTime": 1572781651.245178}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=249264.503124 records/second
[2019-11-03 11:47:31.437] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 17, "duration": 91, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 9 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 46, "sum": 46.0, "min": 46}, "Total Records Seen": {"count": 1, "max": 230000, "sum": 230000.0, "min": 230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 9, "sum": 9.0, "min": 9}}, "EndTime": 1572781651.437796, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 8}, "StartTime": 1572781651.345584}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=270515.787328 records/second
[2019-11-03 11:47:31.544] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 19, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 10 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 51, "sum": 51.0, "min": 51}, "Total Records Seen": {"count": 1, "max": 255000, "sum": 255000.0, "min": 255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 10, "sum": 10.0, "min": 10}}, "EndTime": 1572781651.54472, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 9}, "StartTime": 1572781651.438118}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=234205.612486 records/second
[2019-11-03 11:47:31.299] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 13, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 7 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 36, "sum": 36.0, "min": 36}, "Total Records Seen": {"count": 1, "max": 180000, "sum": 180000.0, "min": 180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 7, "sum": 7.0, "min": 7}}, "EndTime": 1572781651.300212, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 6}, "StartTime": 1572781651.179075}
[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=206112.356017 records/second
[2019-11-03 11:47:31.417] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 15, "duration": 117, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 8 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 41, "sum": 41.0, "min": 41}, "Total Records Seen": {"count": 1, "max": 205000, "sum": 205000.0, "min": 205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 8, "sum": 8.0, "min": 8}}, "EndTime": 1572781651.418261, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 7}, "StartTime": 1572781651.300484}
[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=212013.425533 records/second
[2019-11-03 11:47:31.537] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 17, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 9 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 46, "sum": 46.0, "min": 46}, "Total Records Seen": {"count": 1, "max": 230000, "sum": 230000.0, "min": 230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 9, "sum": 9.0, "min": 9}}, "EndTime": 1572781651.537545, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 8}, "StartTime": 1572781651.41851}
[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209763.02597 records/second
[2019-11-03 11:47:31.659] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 19, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 10 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 51, "sum": 51.0, "min": 51}, "Total Records Seen": {"count": 1, "max": 255000, "sum": 255000.0, "min": 255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 10, "sum": 10.0, "min": 10}}, "EndTime": 1572781651.659652, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 9}, "StartTime": 1572781651.539169}
[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=207255.491823 records/second
[2019-11-03 11:47:31.766] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 21, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 11 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 56, "sum": 56.0, "min": 56}, "Total Records Seen": {"count": 1, "max": 280000, "sum": 280000.0, "min": 280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 11, "sum": 11.0, "min": 11}}, "EndTime": 1572781651.766884, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 10}, "StartTime": 1572781651.66}
[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233601.411532 records/second
[2019-11-03 11:47:31.880] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 23, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 12 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 61, "sum": 61.0, "min": 61}, "Total Records Seen": {"count": 1, "max": 305000, "sum": 305000.0, "min": 305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 12, "sum": 12.0, "min": 12}}, "EndTime": 1572781651.882341, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 11}, "StartTime": 1572781651.767134}
[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216771.099282 records/second
[2019-11-03 11:47:32.005] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 25, "duration": 122, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 13 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 66, "sum": 66.0, "min": 66}, "Total Records Seen": {"count": 1, "max": 330000, "sum": 330000.0, "min": 330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 13, "sum": 13.0, "min": 13}}, "EndTime": 1572781652.006303, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 12}, "StartTime": 1572781651.882572}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=201845.642106 records/second
[2019-11-03 11:47:32.126] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 27, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 14 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 71, "sum": 71.0, "min": 71}, "Total Records Seen": {"count": 1, "max": 355000, "sum": 355000.0, "min": 355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 14, "sum": 14.0, "min": 14}}, "EndTime": 1572781652.12742, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 13}, "StartTime": 1572781652.006544}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=206517.890027 records/second
[2019-11-03 11:47:31.671] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 21, "duration": 126, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 11 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 56, "sum": 56.0, "min": 56}, "Total Records Seen": {"count": 1, "max": 280000, "sum": 280000.0, "min": 280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 11, "sum": 11.0, "min": 11}}, "EndTime": 1572781651.671943, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 10}, "StartTime": 1572781651.544972}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=196678.558433 records/second
[2019-11-03 11:47:31.777] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 23, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 12 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 61, "sum": 61.0, "min": 61}, "Total Records Seen": {"count": 1, "max": 305000, "sum": 305000.0, "min": 305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 12, "sum": 12.0, "min": 12}}, "EndTime": 1572781651.778138, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 11}, "StartTime": 1572781651.672195}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=235702.324034 records/second
[2019-11-03 11:47:31.885] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 25, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 13 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 66, "sum": 66.0, "min": 66}, "Total Records Seen": {"count": 1, "max": 330000, "sum": 330000.0, "min": 330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 13, "sum": 13.0, "min": 13}}, "EndTime": 1572781651.885934, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 12}, "StartTime": 1572781651.778343}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=232080.829543 records/second
[2019-11-03 11:47:31.995] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 27, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 14 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 71, "sum": 71.0, "min": 71}, "Total Records Seen": {"count": 1, "max": 355000, "sum": 355000.0, "min": 355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 14, "sum": 14.0, "min": 14}}, "EndTime": 1572781651.996102, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 13}, "StartTime": 1572781651.887734}
[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=230376.771092 records/second
[2019-11-03 11:47:32.102] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 29, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 15 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 76, "sum": 76.0, "min": 76}, "Total Records Seen": {"count": 1, "max": 380000, "sum": 380000.0, "min": 380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 15, "sum": 15.0, "min": 15}}, "EndTime": 1572781652.102514, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 14}, "StartTime": 1572781651.998441}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=239888.906428 records/second
[2019-11-03 11:47:32.208] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 31, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 16 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 81, "sum": 81.0, "min": 81}, "Total Records Seen": {"count": 1, "max": 405000, "sum": 405000.0, "min": 405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 16, "sum": 16.0, "min": 16}}, "EndTime": 1572781652.209094, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 15}, "StartTime": 1572781652.10273}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=234803.482498 records/second
[2019-11-03 11:47:32.323] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 33, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 17 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 86, "sum": 86.0, "min": 86}, "Total Records Seen": {"count": 1, "max": 430000, "sum": 430000.0, "min": 430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 17, "sum": 17.0, "min": 17}}, "EndTime": 1572781652.324047, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 16}, "StartTime": 1572781652.210925}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=220762.137353 records/second
[2019-11-03 11:47:32.416] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 35, "duration": 90, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 18 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 91, "sum": 91.0, "min": 91}, "Total Records Seen": {"count": 1, "max": 455000, "sum": 455000.0, "min": 455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 18, "sum": 18.0, "min": 18}}, "EndTime": 1572781652.417163, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 17}, "StartTime": 1572781652.325707}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=272922.387384 records/second
[2019-11-03 11:47:32.526] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 37, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 19 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 96, "sum": 96.0, "min": 96}, "Total Records Seen": {"count": 1, "max": 480000, "sum": 480000.0, "min": 480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 19, "sum": 19.0, "min": 19}}, "EndTime": 1572781652.527196, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 18}, "StartTime": 1572781652.417384}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=227432.165709 records/second
[2019-11-03 11:47:32.626] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 39, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 20 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 101, "sum": 101.0, "min": 101}, "Total Records Seen": {"count": 1, "max": 505000, "sum": 505000.0, "min": 505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 20, "sum": 20.0, "min": 20}}, "EndTime": 1572781652.62669, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 19}, "StartTime": 1572781652.528697}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254831.607036 records/second
[2019-11-03 11:47:32.248] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 29, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 15 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 76, "sum": 76.0, "min": 76}, "Total Records Seen": {"count": 1, "max": 380000, "sum": 380000.0, "min": 380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 15, "sum": 15.0, "min": 15}}, "EndTime": 1572781652.249401, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 14}, "StartTime": 1572781652.127715}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=205183.516847 records/second
[2019-11-03 11:47:32.377] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 31, "duration": 125, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 16 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 81, "sum": 81.0, "min": 81}, "Total Records Seen": {"count": 1, "max": 405000, "sum": 405000.0, "min": 405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 16, "sum": 16.0, "min": 16}}, "EndTime": 1572781652.378822, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 15}, "StartTime": 1572781652.2497}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=193302.289226 records/second
[2019-11-03 11:47:32.496] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 33, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 17 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 86, "sum": 86.0, "min": 86}, "Total Records Seen": {"count": 1, "max": 430000, "sum": 430000.0, "min": 430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 17, "sum": 17.0, "min": 17}}, "EndTime": 1572781652.496576, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 16}, "StartTime": 1572781652.379179}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=212693.332035 records/second
[2019-11-03 11:47:32.615] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 35, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 18 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 91, "sum": 91.0, "min": 91}, "Total Records Seen": {"count": 1, "max": 455000, "sum": 455000.0, "min": 455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 18, "sum": 18.0, "min": 18}}, "EndTime": 1572781652.616174, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 17}, "StartTime": 1572781652.496823}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209236.884482 records/second
[2019-11-03 11:47:32.737] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 37, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 19 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 96, "sum": 96.0, "min": 96}, "Total Records Seen": {"count": 1, "max": 480000, "sum": 480000.0, "min": 480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 19, "sum": 19.0, "min": 19}}, "EndTime": 1572781652.738183, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 18}, "StartTime": 1572781652.616413}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=205061.132536 records/second
[2019-11-03 11:47:32.857] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 39, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 20 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 101, "sum": 101.0, "min": 101}, "Total Records Seen": {"count": 1, "max": 505000, "sum": 505000.0, "min": 505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 20, "sum": 20.0, "min": 20}}, "EndTime": 1572781652.858275, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 19}, "StartTime": 1572781652.738444}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208381.972142 records/second
[2019-11-03 11:47:32.966] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 41, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 21 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 106, "sum": 106.0, "min": 106}, "Total Records Seen": {"count": 1, "max": 530000, "sum": 530000.0, "min": 530000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 21, "sum": 21.0, "min": 21}}, "EndTime": 1572781652.966966, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 20}, "StartTime": 1572781652.858526}
[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=229766.962848 records/second
[2019-11-03 11:47:33.074] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 43, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 22 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 111, "sum": 111.0, "min": 111}, "Total Records Seen": {"count": 1, "max": 555000, "sum": 555000.0, "min": 555000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 22, "sum": 22.0, "min": 22}}, "EndTime": 1572781653.075226, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 21}, "StartTime": 1572781652.967602}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=231852.986895 records/second
[2019-11-03 11:47:33.183] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 45, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 23 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 116, "sum": 116.0, "min": 116}, "Total Records Seen": {"count": 1, "max": 580000, "sum": 580000.0, "min": 580000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 23, "sum": 23.0, "min": 23}}, "EndTime": 1572781653.183966, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 22}, "StartTime": 1572781653.075554}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=230295.815882 records/second
[2019-11-03 11:47:32.733] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 41, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 21 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 106, "sum": 106.0, "min": 106}, "Total Records Seen": {"count": 1, "max": 530000, "sum": 530000.0, "min": 530000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 21, "sum": 21.0, "min": 21}}, "EndTime": 1572781652.733428, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 20}, "StartTime": 1572781652.626892}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=234418.188728 records/second
[2019-11-03 11:47:32.852] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 43, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 22 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 111, "sum": 111.0, "min": 111}, "Total Records Seen": {"count": 1, "max": 555000, "sum": 555000.0, "min": 555000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 22, "sum": 22.0, "min": 22}}, "EndTime": 1572781652.85269, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 21}, "StartTime": 1572781652.73363}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=209777.294079 records/second
[2019-11-03 11:47:32.963] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 45, "duration": 110, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 23 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 116, "sum": 116.0, "min": 116}, "Total Records Seen": {"count": 1, "max": 580000, "sum": 580000.0, "min": 580000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 23, "sum": 23.0, "min": 23}}, "EndTime": 1572781652.963672, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 22}, "StartTime": 1572781652.852899}
[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=225459.002118 records/second
[2019-11-03 11:47:33.079] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 47, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 24 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 121, "sum": 121.0, "min": 121}, "Total Records Seen": {"count": 1, "max": 605000, "sum": 605000.0, "min": 605000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 24, "sum": 24.0, "min": 24}}, "EndTime": 1572781653.080286, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 23}, "StartTime": 1572781652.963875}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=214553.817697 records/second
[2019-11-03 11:47:33.194] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 49, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 25 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 126, "sum": 126.0, "min": 126}, "Total Records Seen": {"count": 1, "max": 630000, "sum": 630000.0, "min": 630000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 25, "sum": 25.0, "min": 25}}, "EndTime": 1572781653.194447, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 24}, "StartTime": 1572781653.080736}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=219639.386018 records/second
[2019-11-03 11:47:33.304] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 51, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 26 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 131, "sum": 131.0, "min": 131}, "Total Records Seen": {"count": 1, "max": 655000, "sum": 655000.0, "min": 655000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 26, "sum": 26.0, "min": 26}}, "EndTime": 1572781653.304679, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 25}, "StartTime": 1572781653.194648}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=226936.503505 records/second
[2019-11-03 11:47:33.408] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 53, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 27 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 136, "sum": 136.0, "min": 136}, "Total Records Seen": {"count": 1, "max": 680000, "sum": 680000.0, "min": 680000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 27, "sum": 27.0, "min": 27}}, "EndTime": 1572781653.408885, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 26}, "StartTime": 1572781653.305177}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=240719.926538 records/second
[2019-11-03 11:47:33.506] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 55, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 28 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 141, "sum": 141.0, "min": 141}, "Total Records Seen": {"count": 1, "max": 705000, "sum": 705000.0, "min": 705000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 28, "sum": 28.0, "min": 28}}, "EndTime": 1572781653.507087, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 27}, "StartTime": 1572781653.409142}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254881.161309 records/second
[2019-11-03 11:47:33.616] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 57, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 29 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 146, "sum": 146.0, "min": 146}, "Total Records Seen": {"count": 1, "max": 730000, "sum": 730000.0, "min": 730000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 29, "sum": 29.0, "min": 29}}, "EndTime": 1572781653.616641, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 28}, "StartTime": 1572781653.507342}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=228445.939469 records/second
[2019-11-03 11:47:33.285] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 47, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 24 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 121, "sum": 121.0, "min": 121}, "Total Records Seen": {"count": 1, "max": 605000, "sum": 605000.0, "min": 605000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 24, "sum": 24.0, "min": 24}}, "EndTime": 1572781653.285798, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 23}, "StartTime": 1572781653.184255}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=245898.125588 records/second
[2019-11-03 11:47:33.395] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 49, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 25 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 126, "sum": 126.0, "min": 126}, "Total Records Seen": {"count": 1, "max": 630000, "sum": 630000.0, "min": 630000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 25, "sum": 25.0, "min": 25}}, "EndTime": 1572781653.395731, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 24}, "StartTime": 1572781653.287535}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=230465.887587 records/second
[2019-11-03 11:47:33.507] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 51, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 26 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 131, "sum": 131.0, "min": 131}, "Total Records Seen": {"count": 1, "max": 655000, "sum": 655000.0, "min": 655000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 26, "sum": 26.0, "min": 26}}, "EndTime": 1572781653.507964, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 25}, "StartTime": 1572781653.396171}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=223359.803688 records/second
[2019-11-03 11:47:33.614] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 53, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 27 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 136, "sum": 136.0, "min": 136}, "Total Records Seen": {"count": 1, "max": 680000, "sum": 680000.0, "min": 680000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 27, "sum": 27.0, "min": 27}}, "EndTime": 1572781653.615178, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 26}, "StartTime": 1572781653.508207}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233372.133136 records/second
[2019-11-03 11:47:33.734] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 55, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 28 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 141, "sum": 141.0, "min": 141}, "Total Records Seen": {"count": 1, "max": 705000, "sum": 705000.0, "min": 705000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 28, "sum": 28.0, "min": 28}}, "EndTime": 1572781653.735415, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 27}, "StartTime": 1572781653.615447}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208149.086275 records/second
[2019-11-03 11:47:33.843] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 57, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 29 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 146, "sum": 146.0, "min": 146}, "Total Records Seen": {"count": 1, "max": 730000, "sum": 730000.0, "min": 730000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 29, "sum": 29.0, "min": 29}}, "EndTime": 1572781653.843744, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 28}, "StartTime": 1572781653.737312}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=234438.104777 records/second
[2019-11-03 11:47:33.945] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 59, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 30 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 151, "sum": 151.0, "min": 151}, "Total Records Seen": {"count": 1, "max": 755000, "sum": 755000.0, "min": 755000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 30, "sum": 30.0, "min": 30}}, "EndTime": 1572781653.946883, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 29}, "StartTime": 1572781653.845438}
[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=246162.514174 records/second
[2019-11-03 11:47:34.066] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 61, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 31 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 156, "sum": 156.0, "min": 156}, "Total Records Seen": {"count": 1, "max": 780000, "sum": 780000.0, "min": 780000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 31, "sum": 31.0, "min": 31}}, "EndTime": 1572781654.067035, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 30}, "StartTime": 1572781653.94709}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208220.592586 records/second
[2019-11-03 11:47:34.171] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 63, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 32 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 161, "sum": 161.0, "min": 161}, "Total Records Seen": {"count": 1, "max": 805000, "sum": 805000.0, "min": 805000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 32, "sum": 32.0, "min": 32}}, "EndTime": 1572781654.171523, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 31}, "StartTime": 1572781654.068661}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=242749.526574 records/second
[2019-11-03 11:47:33.717] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 59, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 30 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 151, "sum": 151.0, "min": 151}, "Total Records Seen": {"count": 1, "max": 755000, "sum": 755000.0, "min": 755000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 30, "sum": 30.0, "min": 30}}, "EndTime": 1572781653.71781, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 29}, "StartTime": 1572781653.616888}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=247224.029801 records/second
[2019-11-03 11:47:33.821] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 61, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 31 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 156, "sum": 156.0, "min": 156}, "Total Records Seen": {"count": 1, "max": 780000, "sum": 780000.0, "min": 780000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 31, "sum": 31.0, "min": 31}}, "EndTime": 1572781653.821933, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 30}, "StartTime": 1572781653.718127}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=240509.56349 records/second
[2019-11-03 11:47:33.916] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 63, "duration": 94, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 32 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 161, "sum": 161.0, "min": 161}, "Total Records Seen": {"count": 1, "max": 805000, "sum": 805000.0, "min": 805000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 32, "sum": 32.0, "min": 32}}, "EndTime": 1572781653.916884, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 31}, "StartTime": 1572781653.822185}
[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=263612.98335 records/second
[2019-11-03 11:47:34.010] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 65, "duration": 93, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 33 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 166, "sum": 166.0, "min": 166}, "Total Records Seen": {"count": 1, "max": 830000, "sum": 830000.0, "min": 830000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 33, "sum": 33.0, "min": 33}}, "EndTime": 1572781654.011389, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 32}, "StartTime": 1572781653.917124}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=264847.430143 records/second
[2019-11-03 11:47:34.105] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 67, "duration": 93, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 34 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 171, "sum": 171.0, "min": 171}, "Total Records Seen": {"count": 1, "max": 855000, "sum": 855000.0, "min": 855000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 34, "sum": 34.0, "min": 34}}, "EndTime": 1572781654.106247, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 33}, "StartTime": 1572781654.011634}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=263838.502784 records/second
[2019-11-03 11:47:34.205] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 69, "duration": 98, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 35 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 176, "sum": 176.0, "min": 176}, "Total Records Seen": {"count": 1, "max": 880000, "sum": 880000.0, "min": 880000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 35, "sum": 35.0, "min": 35}}, "EndTime": 1572781654.205973, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 34}, "StartTime": 1572781654.106499}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=250973.784295 records/second
[2019-11-03 11:47:34.306] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 71, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 36 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 181, "sum": 181.0, "min": 181}, "Total Records Seen": {"count": 1, "max": 905000, "sum": 905000.0, "min": 905000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 36, "sum": 36.0, "min": 36}}, "EndTime": 1572781654.306714, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 35}, "StartTime": 1572781654.206226}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=248451.820189 records/second
[2019-11-03 11:47:34.400] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 73, "duration": 91, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 37 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 186, "sum": 186.0, "min": 186}, "Total Records Seen": {"count": 1, "max": 930000, "sum": 930000.0, "min": 930000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 37, "sum": 37.0, "min": 37}}, "EndTime": 1572781654.400918, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 36}, "StartTime": 1572781654.308464}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=269996.163423 records/second
[2019-11-03 11:47:34.509] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 75, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 38 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 191, "sum": 191.0, "min": 191}, "Total Records Seen": {"count": 1, "max": 955000, "sum": 955000.0, "min": 955000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 38, "sum": 38.0, "min": 38}}, "EndTime": 1572781654.50983, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 37}, "StartTime": 1572781654.402811}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=233186.856197 records/second
[2019-11-03 11:47:34.606] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 77, "duration": 94, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 39 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 196, "sum": 196.0, "min": 196}, "Total Records Seen": {"count": 1, "max": 980000, "sum": 980000.0, "min": 980000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 39, "sum": 39.0, "min": 39}}, "EndTime": 1572781654.606595, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 38}, "StartTime": 1572781654.511681}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=263049.548069 records/second
[2019-11-03 11:47:34.280] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 65, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 33 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 166, "sum": 166.0, "min": 166}, "Total Records Seen": {"count": 1, "max": 830000, "sum": 830000.0, "min": 830000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 33, "sum": 33.0, "min": 33}}, "EndTime": 1572781654.280882, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 32}, "StartTime": 1572781654.171763}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=228820.324144 records/second
[2019-11-03 11:47:34.382] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 67, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 34 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 171, "sum": 171.0, "min": 171}, "Total Records Seen": {"count": 1, "max": 855000, "sum": 855000.0, "min": 855000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 34, "sum": 34.0, "min": 34}}, "EndTime": 1572781654.3829, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 33}, "StartTime": 1572781654.281128}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=245078.566704 records/second
[2019-11-03 11:47:34.495] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 69, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 35 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 176, "sum": 176.0, "min": 176}, "Total Records Seen": {"count": 1, "max": 880000, "sum": 880000.0, "min": 880000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 35, "sum": 35.0, "min": 35}}, "EndTime": 1572781654.495952, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 34}, "StartTime": 1572781654.383263}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=221599.585786 records/second
[2019-11-03 11:47:34.602] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 71, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 36 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 181, "sum": 181.0, "min": 181}, "Total Records Seen": {"count": 1, "max": 905000, "sum": 905000.0, "min": 905000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 36, "sum": 36.0, "min": 36}}, "EndTime": 1572781654.603045, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 35}, "StartTime": 1572781654.496192}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233684.187068 records/second
[2019-11-03 11:47:34.716] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 73, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 37 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 186, "sum": 186.0, "min": 186}, "Total Records Seen": {"count": 1, "max": 930000, "sum": 930000.0, "min": 930000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 37, "sum": 37.0, "min": 37}}, "EndTime": 1572781654.716717, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 36}, "StartTime": 1572781654.603287}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=220101.342133 records/second
[2019-11-03 11:47:34.835] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 75, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 38 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 191, "sum": 191.0, "min": 191}, "Total Records Seen": {"count": 1, "max": 955000, "sum": 955000.0, "min": 955000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 38, "sum": 38.0, "min": 38}}, "EndTime": 1572781654.836292, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 37}, "StartTime": 1572781654.716984}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209318.750287 records/second
[2019-11-03 11:47:34.942] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 77, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 39 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 196, "sum": 196.0, "min": 196}, "Total Records Seen": {"count": 1, "max": 980000, "sum": 980000.0, "min": 980000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 39, "sum": 39.0, "min": 39}}, "EndTime": 1572781654.942638, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 38}, "StartTime": 1572781654.836537}
[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=235349.463572 records/second
[2019-11-03 11:47:35.055] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 79, "duration": 110, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 40 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 201, "sum": 201.0, "min": 201}, "Total Records Seen": {"count": 1, "max": 1005000, "sum": 1005000.0, "min": 1005000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 40, "sum": 40.0, "min": 40}}, "EndTime": 1572781655.055605, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 39}, "StartTime": 1572781654.944466}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=224686.993098 records/second
[2019-11-03 11:47:35.176] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 81, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 41 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 206, "sum": 206.0, "min": 206}, "Total Records Seen": {"count": 1, "max": 1030000, "sum": 1030000.0, "min": 1030000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 41, "sum": 41.0, "min": 41}}, "EndTime": 1572781655.177454, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 40}, "StartTime": 1572781655.057545}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208179.253468 records/second
[2019-11-03 11:47:34.712] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 79, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 40 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 201, "sum": 201.0, "min": 201}, "Total Records Seen": {"count": 1, "max": 1005000, "sum": 1005000.0, "min": 1005000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 40, "sum": 40.0, "min": 40}}, "EndTime": 1572781654.712592, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 39}, "StartTime": 1572781654.606835}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=236098.233163 records/second
[2019-11-03 11:47:34.811] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 81, "duration": 98, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 41 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 206, "sum": 206.0, "min": 206}, "Total Records Seen": {"count": 1, "max": 1030000, "sum": 1030000.0, "min": 1030000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 41, "sum": 41.0, "min": 41}}, "EndTime": 1572781654.811942, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 40}, "StartTime": 1572781654.712832}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=251942.229281 records/second
[2019-11-03 11:47:34.908] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 83, "duration": 96, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 42 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 211, "sum": 211.0, "min": 211}, "Total Records Seen": {"count": 1, "max": 1055000, "sum": 1055000.0, "min": 1055000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 42, "sum": 42.0, "min": 42}}, "EndTime": 1572781654.909221, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 41}, "StartTime": 1572781654.812146}
[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=257259.920411 records/second
[2019-11-03 11:47:35.013] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 85, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 43 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 216, "sum": 216.0, "min": 216}, "Total Records Seen": {"count": 1, "max": 1080000, "sum": 1080000.0, "min": 1080000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 43, "sum": 43.0, "min": 43}}, "EndTime": 1572781655.014094, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 42}, "StartTime": 1572781654.909413}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=238581.673887 records/second
[2019-11-03 11:47:35.110] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 87, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 44 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 221, "sum": 221.0, "min": 221}, "Total Records Seen": {"count": 1, "max": 1105000, "sum": 1105000.0, "min": 1105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 44, "sum": 44.0, "min": 44}}, "EndTime": 1572781655.110599, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 43}, "StartTime": 1572781655.01429}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=259303.973233 records/second
[2019-11-03 11:47:35.203] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 89, "duration": 92, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 45 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 226, "sum": 226.0, "min": 226}, "Total Records Seen": {"count": 1, "max": 1130000, "sum": 1130000.0, "min": 1130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 45, "sum": 45.0, "min": 45}}, "EndTime": 1572781655.203846, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 44}, "StartTime": 1572781655.11079}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=268355.078287 records/second
[2019-11-03 11:47:35.301] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 91, "duration": 96, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 46 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 231, "sum": 231.0, "min": 231}, "Total Records Seen": {"count": 1, "max": 1155000, "sum": 1155000.0, "min": 1155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 46, "sum": 46.0, "min": 46}}, "EndTime": 1572781655.301512, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 45}, "StartTime": 1572781655.204047}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=256218.311012 records/second
[2019-11-03 11:47:35.402] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 93, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 47 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 236, "sum": 236.0, "min": 236}, "Total Records Seen": {"count": 1, "max": 1180000, "sum": 1180000.0, "min": 1180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 47, "sum": 47.0, "min": 47}}, "EndTime": 1572781655.402659, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 46}, "StartTime": 1572781655.301705}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=247390.26318 records/second
[2019-11-03 11:47:35.497] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 95, "duration": 94, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 48 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 241, "sum": 241.0, "min": 241}, "Total Records Seen": {"count": 1, "max": 1205000, "sum": 1205000.0, "min": 1205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 48, "sum": 48.0, "min": 48}}, "EndTime": 1572781655.498125, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 47}, "StartTime": 1572781655.402851}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=262123.030158 records/second
[2019-11-03 11:47:35.597] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 97, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 49 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 246, "sum": 246.0, "min": 246}, "Total Records Seen": {"count": 1, "max": 1230000, "sum": 1230000.0, "min": 1230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 49, "sum": 49.0, "min": 49}}, "EndTime": 1572781655.597852, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 48}, "StartTime": 1572781655.499737}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254351.928665 records/second
[2019-11-03 11:47:35.291] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 83, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 42 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 211, "sum": 211.0, "min": 211}, "Total Records Seen": {"count": 1, "max": 1055000, "sum": 1055000.0, "min": 1055000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 42, "sum": 42.0, "min": 42}}, "EndTime": 1572781655.291625, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 41}, "StartTime": 1572781655.179141}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=221949.970155 records/second
[2019-11-03 11:47:35.408] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 85, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 43 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 216, "sum": 216.0, "min": 216}, "Total Records Seen": {"count": 1, "max": 1080000, "sum": 1080000.0, "min": 1080000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 43, "sum": 43.0, "min": 43}}, "EndTime": 1572781655.408899, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 42}, "StartTime": 1572781655.291992}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=213604.075804 records/second
[2019-11-03 11:47:35.513] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 87, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 44 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 221, "sum": 221.0, "min": 221}, "Total Records Seen": {"count": 1, "max": 1105000, "sum": 1105000.0, "min": 1105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 44, "sum": 44.0, "min": 44}}, "EndTime": 1572781655.513764, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 43}, "StartTime": 1572781655.40914}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=238626.738367 records/second
[2019-11-03 11:47:35.637] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 89, "duration": 121, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 45 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 226, "sum": 226.0, "min": 226}, "Total Records Seen": {"count": 1, "max": 1130000, "sum": 1130000.0, "min": 1130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 45, "sum": 45.0, "min": 45}}, "EndTime": 1572781655.637705, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 44}, "StartTime": 1572781655.514016}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=201894.610373 records/second
[2019-11-03 11:47:35.743] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 91, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 46 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 231, "sum": 231.0, "min": 231}, "Total Records Seen": {"count": 1, "max": 1155000, "sum": 1155000.0, "min": 1155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 46, "sum": 46.0, "min": 46}}, "EndTime": 1572781655.744241, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 45}, "StartTime": 1572781655.637956}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=234933.94992 records/second
[2019-11-03 11:47:35.854] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 93, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 47 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 236, "sum": 236.0, "min": 236}, "Total Records Seen": {"count": 1, "max": 1180000, "sum": 1180000.0, "min": 1180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 47, "sum": 47.0, "min": 47}}, "EndTime": 1572781655.854611, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 46}, "StartTime": 1572781655.744479}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=226738.744973 records/second
[2019-11-03 11:47:35.963] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 95, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 48 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 241, "sum": 241.0, "min": 241}, "Total Records Seen": {"count": 1, "max": 1205000, "sum": 1205000.0, "min": 1205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 48, "sum": 48.0, "min": 48}}, "EndTime": 1572781655.96422, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 47}, "StartTime": 1572781655.854849}
[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=228308.159928 records/second
[2019-11-03 11:47:36.080] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 97, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 49 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 246, "sum": 246.0, "min": 246}, "Total Records Seen": {"count": 1, "max": 1230000, "sum": 1230000.0, "min": 1230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 49, "sum": 49.0, "min": 49}}, "EndTime": 1572781656.081254, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 48}, "StartTime": 1572781655.964466}
[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=213826.658999 records/second
[2019-11-03 11:47:36.195] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 99, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 50 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 251, "sum": 251.0, "min": 251}, "Total Records Seen": {"count": 1, "max": 1255000, "sum": 1255000.0, "min": 1255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 50, "sum": 50.0, "min": 50}}, "EndTime": 1572781656.196875, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 49}, "StartTime": 1572781656.081495}
[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216472.608961 records/second
[2019-11-03 11:47:35.695] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 99, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 50 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 251, "sum": 251.0, "min": 251}, "Total Records Seen": {"count": 1, "max": 1255000, "sum": 1255000.0, "min": 1255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 50, "sum": 50.0, "min": 50}}, "EndTime": 1572781655.69541, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 49}, "StartTime": 1572781655.599632}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=260737.322147 records/second
[2019-11-03 11:47:35.811] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 101, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 51 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 256, "sum": 256.0, "min": 256}, "Total Records Seen": {"count": 1, "max": 1280000, "sum": 1280000.0, "min": 1280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 51, "sum": 51.0, "min": 51}}, "EndTime": 1572781655.811704, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 50}, "StartTime": 1572781655.697071}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=217885.922078 records/second
[2019-11-03 11:47:35.926] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 103, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 52 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 261, "sum": 261.0, "min": 261}, "Total Records Seen": {"count": 1, "max": 1305000, "sum": 1305000.0, "min": 1305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 52, "sum": 52.0, "min": 52}}, "EndTime": 1572781655.926947, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 51}, "StartTime": 1572781655.813594}
[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=220338.142528 records/second
[2019-11-03 11:47:36.025] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 105, "duration": 96, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 53 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 266, "sum": 266.0, "min": 266}, "Total Records Seen": {"count": 1, "max": 1330000, "sum": 1330000.0, "min": 1330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 53, "sum": 53.0, "min": 53}}, "EndTime": 1572781656.026239, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 52}, "StartTime": 1572781655.928995}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=256794.960963 records/second
[2019-11-03 11:47:36.130] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 107, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 54 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 271, "sum": 271.0, "min": 271}, "Total Records Seen": {"count": 1, "max": 1355000, "sum": 1355000.0, "min": 1355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 54, "sum": 54.0, "min": 54}}, "EndTime": 1572781656.131204, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 53}, "StartTime": 1572781656.027974}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241924.550851 records/second
[2019-11-03 11:47:36.221] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 109, "duration": 90, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 55 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 276, "sum": 276.0, "min": 276}, "Total Records Seen": {"count": 1, "max": 1380000, "sum": 1380000.0, "min": 1380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 55, "sum": 55.0, "min": 55}}, "EndTime": 1572781656.222293, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 54}, "StartTime": 1572781656.131432}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=274815.754437 records/second
[2019-11-03 11:47:36.316] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 111, "duration": 93, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 56 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 281, "sum": 281.0, "min": 281}, "Total Records Seen": {"count": 1, "max": 1405000, "sum": 1405000.0, "min": 1405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 56, "sum": 56.0, "min": 56}}, "EndTime": 1572781656.317027, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 55}, "StartTime": 1572781656.222511}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=264206.794548 records/second
[2019-11-03 11:47:36.408] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 113, "duration": 89, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 57 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 286, "sum": 286.0, "min": 286}, "Total Records Seen": {"count": 1, "max": 1430000, "sum": 1430000.0, "min": 1430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 57, "sum": 57.0, "min": 57}}, "EndTime": 1572781656.40873, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 56}, "StartTime": 1572781656.318606}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=277057.301919 records/second
[2019-11-03 11:47:36.506] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 115, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 58 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 291, "sum": 291.0, "min": 291}, "Total Records Seen": {"count": 1, "max": 1455000, "sum": 1455000.0, "min": 1455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 58, "sum": 58.0, "min": 58}}, "EndTime": 1572781656.507134, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 57}, "StartTime": 1572781656.408955}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254313.064948 records/second
[2019-11-03 11:47:36.611] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 117, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 59 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 296, "sum": 296.0, "min": 296}, "Total Records Seen": {"count": 1, "max": 1480000, "sum": 1480000.0, "min": 1480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 59, "sum": 59.0, "min": 59}}, "EndTime": 1572781656.61187, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 58}, "StartTime": 1572781656.508761}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=242210.667584 records/second
[2019-11-03 11:47:36.315] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 101, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 51 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 256, "sum": 256.0, "min": 256}, "Total Records Seen": {"count": 1, "max": 1280000, "sum": 1280000.0, "min": 1280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 51, "sum": 51.0, "min": 51}}, "EndTime": 1572781656.316061, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 50}, "StartTime": 1572781656.19707}
[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209839.005013 records/second
[2019-11-03 11:47:36.435] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 103, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 52 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 261, "sum": 261.0, "min": 261}, "Total Records Seen": {"count": 1, "max": 1305000, "sum": 1305000.0, "min": 1305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 52, "sum": 52.0, "min": 52}}, "EndTime": 1572781656.435687, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 51}, "StartTime": 1572781656.316337}
[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209205.157825 records/second
[2019-11-03 11:47:36.550] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 105, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 53 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 266, "sum": 266.0, "min": 266}, "Total Records Seen": {"count": 1, "max": 1330000, "sum": 1330000.0, "min": 1330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 53, "sum": 53.0, "min": 53}}, "EndTime": 1572781656.550555, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 52}, "StartTime": 1572781656.436036}
[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=218056.742633 records/second
[2019-11-03 11:47:36.667] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 107, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 54 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 271, "sum": 271.0, "min": 271}, "Total Records Seen": {"count": 1, "max": 1355000, "sum": 1355000.0, "min": 1355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 54, "sum": 54.0, "min": 54}}, "EndTime": 1572781656.668235, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 53}, "StartTime": 1572781656.550799}
[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=212650.198033 records/second
[2019-11-03 11:47:36.783] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 109, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 55 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 276, "sum": 276.0, "min": 276}, "Total Records Seen": {"count": 1, "max": 1380000, "sum": 1380000.0, "min": 1380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 55, "sum": 55.0, "min": 55}}, "EndTime": 1572781656.783669, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 54}, "StartTime": 1572781656.668477}
[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216786.336732 records/second
[2019-11-03 11:47:36.893] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 111, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 56 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 281, "sum": 281.0, "min": 281}, "Total Records Seen": {"count": 1, "max": 1405000, "sum": 1405000.0, "min": 1405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 56, "sum": 56.0, "min": 56}}, "EndTime": 1572781656.894322, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 55}, "StartTime": 1572781656.783985}
[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=226315.925788 records/second
[2019-11-03 11:47:37.002] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 113, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 57 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 286, "sum": 286.0, "min": 286}, "Total Records Seen": {"count": 1, "max": 1430000, "sum": 1430000.0, "min": 1430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 57, "sum": 57.0, "min": 57}}, "EndTime": 1572781657.002531, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 56}, "StartTime": 1572781656.894559}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=231273.599887 records/second
[2019-11-03 11:47:37.117] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 115, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 58 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 291, "sum": 291.0, "min": 291}, "Total Records Seen": {"count": 1, "max": 1455000, "sum": 1455000.0, "min": 1455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 58, "sum": 58.0, "min": 58}}, "EndTime": 1572781657.117718, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 57}, "StartTime": 1572781657.004481}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=220530.454552 records/second
[2019-11-03 11:47:37.229] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 117, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 59 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 296, "sum": 296.0, "min": 296}, "Total Records Seen": {"count": 1, "max": 1480000, "sum": 1480000.0, "min": 1480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 59, "sum": 59.0, "min": 59}}, "EndTime": 1572781657.229763, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 58}, "StartTime": 1572781657.119333}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=226135.826937 records/second
[2019-11-03 11:47:37.346] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 119, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 60 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 301, "sum": 301.0, "min": 301}, "Total Records Seen": {"count": 1, "max": 1505000, "sum": 1505000.0, "min": 1505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 60, "sum": 60.0, "min": 60}}, "EndTime": 1572781657.346454, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 59}, "StartTime": 1572781657.231374}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=217004.377024 records/second
[2019-11-03 11:47:37.458] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 121, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 61 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 306, "sum": 306.0, "min": 306}, "Total Records Seen": {"count": 1, "max": 1530000, "sum": 1530000.0, "min": 1530000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 61, "sum": 61.0, "min": 61}}, "EndTime": 1572781657.459096, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 60}, "StartTime": 1572781657.346694}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=222127.695632 records/second
[2019-11-03 11:47:37.564] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 123, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 62 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 311, "sum": 311.0, "min": 311}, "Total Records Seen": {"count": 1, "max": 1555000, "sum": 1555000.0, "min": 1555000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 62, "sum": 62.0, "min": 62}}, "EndTime": 1572781657.56538, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 61}, "StartTime": 1572781657.459356}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=235513.85916 records/second
[2019-11-03 11:47:37.674] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 125, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 63 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 316, "sum": 316.0, "min": 316}, "Total Records Seen": {"count": 1, "max": 1580000, "sum": 1580000.0, "min": 1580000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 63, "sum": 63.0, "min": 63}}, "EndTime": 1572781657.674904, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 62}, "StartTime": 1572781657.565617}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=228475.307499 records/second
[2019-11-03 11:47:37.780] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 127, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 64 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 321, "sum": 321.0, "min": 321}, "Total Records Seen": {"count": 1, "max": 1605000, "sum": 1605000.0, "min": 1605000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 64, "sum": 64.0, "min": 64}}, "EndTime": 1572781657.780621, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 63}, "StartTime": 1572781657.675158}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=236744.830825 records/second
[2019-11-03 11:47:37.902] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 129, "duration": 121, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 65 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 326, "sum": 326.0, "min": 326}, "Total Records Seen": {"count": 1, "max": 1630000, "sum": 1630000.0, "min": 1630000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 65, "sum": 65.0, "min": 65}}, "EndTime": 1572781657.903117, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 64}, "StartTime": 1572781657.78087}
[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=204263.409598 records/second
[2019-11-03 11:47:38.008] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 131, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 66 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 331, "sum": 331.0, "min": 331}, "Total Records Seen": {"count": 1, "max": 1655000, "sum": 1655000.0, "min": 1655000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 66, "sum": 66.0, "min": 66}}, "EndTime": 1572781658.009231, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 65}, "StartTime": 1572781657.90343}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=235951.071548 records/second
[2019-11-03 11:47:38.115] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 133, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 67 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 336, "sum": 336.0, "min": 336}, "Total Records Seen": {"count": 1, "max": 1680000, "sum": 1680000.0, "min": 1680000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 67, "sum": 67.0, "min": 67}}, "EndTime": 1572781658.115497, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 66}, "StartTime": 1572781658.009521}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=235577.882222 records/second
[2019-11-03 11:47:38.218] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 135, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 68 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 341, "sum": 341.0, "min": 341}, "Total Records Seen": {"count": 1, "max": 1705000, "sum": 1705000.0, "min": 1705000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 68, "sum": 68.0, "min": 68}}, "EndTime": 1572781658.218867, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 67}, "StartTime": 1572781658.115755}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=242118.947176 records/second
[2019-11-03 11:47:36.720] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 119, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 60 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 301, "sum": 301.0, "min": 301}, "Total Records Seen": {"count": 1, "max": 1505000, "sum": 1505000.0, "min": 1505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 60, "sum": 60.0, "min": 60}}, "EndTime": 1572781656.720562, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 59}, "StartTime": 1572781656.613444}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=233106.50539 records/second
[2019-11-03 11:47:36.829] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 121, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 61 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 306, "sum": 306.0, "min": 306}, "Total Records Seen": {"count": 1, "max": 1530000, "sum": 1530000.0, "min": 1530000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 61, "sum": 61.0, "min": 61}}, "EndTime": 1572781656.829433, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 60}, "StartTime": 1572781656.720761}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=229819.839565 records/second
[2019-11-03 11:47:36.947] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 123, "duration": 115, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 62 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 311, "sum": 311.0, "min": 311}, "Total Records Seen": {"count": 1, "max": 1555000, "sum": 1555000.0, "min": 1555000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 62, "sum": 62.0, "min": 62}}, "EndTime": 1572781656.94762, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 61}, "StartTime": 1572781656.831255}
[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=214647.367204 records/second
[2019-11-03 11:47:37.049] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 125, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 63 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 316, "sum": 316.0, "min": 316}, "Total Records Seen": {"count": 1, "max": 1580000, "sum": 1580000.0, "min": 1580000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 63, "sum": 63.0, "min": 63}}, "EndTime": 1572781657.050018, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 62}, "StartTime": 1572781656.949825}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=249187.496138 records/second
[2019-11-03 11:47:37.145] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 127, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 64 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 321, "sum": 321.0, "min": 321}, "Total Records Seen": {"count": 1, "max": 1605000, "sum": 1605000.0, "min": 1605000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 64, "sum": 64.0, "min": 64}}, "EndTime": 1572781657.146114, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 63}, "StartTime": 1572781657.050269}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=260495.066231 records/second
[2019-11-03 11:47:37.247] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 129, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 65 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 326, "sum": 326.0, "min": 326}, "Total Records Seen": {"count": 1, "max": 1630000, "sum": 1630000.0, "min": 1630000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 65, "sum": 65.0, "min": 65}}, "EndTime": 1572781657.247753, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 64}, "StartTime": 1572781657.14635}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=246215.691385 records/second
[2019-11-03 11:47:37.343] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 131, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 66 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 331, "sum": 331.0, "min": 331}, "Total Records Seen": {"count": 1, "max": 1655000, "sum": 1655000.0, "min": 1655000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 66, "sum": 66.0, "min": 66}}, "EndTime": 1572781657.344179, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 65}, "StartTime": 1572781657.248007}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=259551.727125 records/second
[2019-11-03 11:47:37.451] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 133, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 67 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 336, "sum": 336.0, "min": 336}, "Total Records Seen": {"count": 1, "max": 1680000, "sum": 1680000.0, "min": 1680000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 67, "sum": 67.0, "min": 67}}, "EndTime": 1572781657.451658, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 66}, "StartTime": 1572781657.344442}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=232884.920767 records/second
[2019-11-03 11:47:37.550] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 135, "duration": 98, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 68 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 341, "sum": 341.0, "min": 341}, "Total Records Seen": {"count": 1, "max": 1705000, "sum": 1705000.0, "min": 1705000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 68, "sum": 68.0, "min": 68}}, "EndTime": 1572781657.551124, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 67}, "StartTime": 1572781657.451914}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=251663.4746 records/second
[2019-11-03 11:47:37.652] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 137, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 69 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 346, "sum": 346.0, "min": 346}, "Total Records Seen": {"count": 1, "max": 1730000, "sum": 1730000.0, "min": 1730000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 69, "sum": 69.0, "min": 69}}, "EndTime": 1572781657.652952, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 68}, "StartTime": 1572781657.551352}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=245806.472786 records/second
[2019-11-03 11:47:37.745] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 139, "duration": 91, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 70 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 351, "sum": 351.0, "min": 351}, "Total Records Seen": {"count": 1, "max": 1755000, "sum": 1755000.0, "min": 1755000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 70, "sum": 70.0, "min": 70}}, "EndTime": 1572781657.745565, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 69}, "StartTime": 1572781657.653178}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=270211.850321 records/second
[2019-11-03 11:47:37.839] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 141, "duration": 93, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 71 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 356, "sum": 356.0, "min": 356}, "Total Records Seen": {"count": 1, "max": 1780000, "sum": 1780000.0, "min": 1780000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 71, "sum": 71.0, "min": 71}}, "EndTime": 1572781657.839737, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 70}, "StartTime": 1572781657.745817}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=265818.946941 records/second
[2019-11-03 11:47:37.946] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 143, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 72 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 361, "sum": 361.0, "min": 361}, "Total Records Seen": {"count": 1, "max": 1805000, "sum": 1805000.0, "min": 1805000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 72, "sum": 72.0, "min": 72}}, "EndTime": 1572781657.94681, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 71}, "StartTime": 1572781657.839979}
[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=233723.773457 records/second
[2019-11-03 11:47:38.048] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 145, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 73 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 366, "sum": 366.0, "min": 366}, "Total Records Seen": {"count": 1, "max": 1830000, "sum": 1830000.0, "min": 1830000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 73, "sum": 73.0, "min": 73}}, "EndTime": 1572781658.048629, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 72}, "StartTime": 1572781657.947054}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=245784.578458 records/second
[2019-11-03 11:47:38.153] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 147, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 74 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 371, "sum": 371.0, "min": 371}, "Total Records Seen": {"count": 1, "max": 1855000, "sum": 1855000.0, "min": 1855000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 74, "sum": 74.0, "min": 74}}, "EndTime": 1572781658.153764, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 73}, "StartTime": 1572781658.048879}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=238061.139932 records/second
[2019-11-03 11:47:38.257] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 149, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 75 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 376, "sum": 376.0, "min": 376}, "Total Records Seen": {"count": 1, "max": 1880000, "sum": 1880000.0, "min": 1880000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 75, "sum": 75.0, "min": 75}}, "EndTime": 1572781658.25836, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 74}, "StartTime": 1572781658.154126}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=239560.073017 records/second
[2019-11-03 11:47:38.361] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 151, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 76 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 381, "sum": 381.0, "min": 381}, "Total Records Seen": {"count": 1, "max": 1905000, "sum": 1905000.0, "min": 1905000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 76, "sum": 76.0, "min": 76}}, "EndTime": 1572781658.362082, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 75}, "StartTime": 1572781658.258831}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241788.993576 records/second
[2019-11-03 11:47:38.461] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 153, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 77 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 386, "sum": 386.0, "min": 386}, "Total Records Seen": {"count": 1, "max": 1930000, "sum": 1930000.0, "min": 1930000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 77, "sum": 77.0, "min": 77}}, "EndTime": 1572781658.462326, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 76}, "StartTime": 1572781658.362336}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=249697.812534 records/second
[2019-11-03 11:47:38.572] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 155, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 78 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 391, "sum": 391.0, "min": 391}, "Total Records Seen": {"count": 1, "max": 1955000, "sum": 1955000.0, "min": 1955000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 78, "sum": 78.0, "min": 78}}, "EndTime": 1572781658.573086, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 77}, "StartTime": 1572781658.462574}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=225957.962151 records/second
[2019-11-03 11:47:38.327] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 137, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 69 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 346, "sum": 346.0, "min": 346}, "Total Records Seen": {"count": 1, "max": 1730000, "sum": 1730000.0, "min": 1730000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 69, "sum": 69.0, "min": 69}}, "EndTime": 1572781658.327902, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 68}, "StartTime": 1572781658.219121}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=229541.126148 records/second
[2019-11-03 11:47:38.434] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 139, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 70 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 351, "sum": 351.0, "min": 351}, "Total Records Seen": {"count": 1, "max": 1755000, "sum": 1755000.0, "min": 1755000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 70, "sum": 70.0, "min": 70}}, "EndTime": 1572781658.435265, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 69}, "StartTime": 1572781658.328147}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233102.359754 records/second
[2019-11-03 11:47:38.545] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 141, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 71 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 356, "sum": 356.0, "min": 356}, "Total Records Seen": {"count": 1, "max": 1780000, "sum": 1780000.0, "min": 1780000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 71, "sum": 71.0, "min": 71}}, "EndTime": 1572781658.54583, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 70}, "StartTime": 1572781658.435508}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=226338.885807 records/second
[2019-11-03 11:47:38.658] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 143, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 72 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 361, "sum": 361.0, "min": 361}, "Total Records Seen": {"count": 1, "max": 1805000, "sum": 1805000.0, "min": 1805000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 72, "sum": 72.0, "min": 72}}, "EndTime": 1572781658.658848, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 71}, "StartTime": 1572781658.546075}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=221397.458284 records/second
[2019-11-03 11:47:38.770] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 145, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 73 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 366, "sum": 366.0, "min": 366}, "Total Records Seen": {"count": 1, "max": 1830000, "sum": 1830000.0, "min": 1830000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 73, "sum": 73.0, "min": 73}}, "EndTime": 1572781658.771435, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 72}, "StartTime": 1572781658.659151}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=222358.50457 records/second
[2019-11-03 11:47:38.893] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 147, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 74 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 371, "sum": 371.0, "min": 371}, "Total Records Seen": {"count": 1, "max": 1855000, "sum": 1855000.0, "min": 1855000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 74, "sum": 74.0, "min": 74}}, "EndTime": 1572781658.893545, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 73}, "StartTime": 1572781658.771688}
[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=204919.669885 records/second
[2019-11-03 11:47:39.014] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 149, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 75 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 376, "sum": 376.0, "min": 376}, "Total Records Seen": {"count": 1, "max": 1880000, "sum": 1880000.0, "min": 1880000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 75, "sum": 75.0, "min": 75}}, "EndTime": 1572781659.01457, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 74}, "StartTime": 1572781658.893806}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=206791.172816 records/second
[2019-11-03 11:47:39.134] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 151, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 76 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 381, "sum": 381.0, "min": 381}, "Total Records Seen": {"count": 1, "max": 1905000, "sum": 1905000.0, "min": 1905000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 76, "sum": 76.0, "min": 76}}, "EndTime": 1572781659.134788, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 75}, "StartTime": 1572781659.014817}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=207656.90476 records/second
[2019-11-03 11:47:38.679] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 157, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 79 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 396, "sum": 396.0, "min": 396}, "Total Records Seen": {"count": 1, "max": 1980000, "sum": 1980000.0, "min": 1980000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 79, "sum": 79.0, "min": 79}}, "EndTime": 1572781658.679666, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 78}, "StartTime": 1572781658.57351}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=235247.558516 records/second
[2019-11-03 11:47:38.783] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 159, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 80 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 401, "sum": 401.0, "min": 401}, "Total Records Seen": {"count": 1, "max": 2005000, "sum": 2005000.0, "min": 2005000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 80, "sum": 80.0, "min": 80}}, "EndTime": 1572781658.783399, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 79}, "StartTime": 1572781658.68004}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241622.405081 records/second
[2019-11-03 11:47:38.878] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 161, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 81 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 406, "sum": 406.0, "min": 406}, "Total Records Seen": {"count": 1, "max": 2030000, "sum": 2030000.0, "min": 2030000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 81, "sum": 81.0, "min": 81}}, "EndTime": 1572781658.879164, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 80}, "StartTime": 1572781658.783612}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=261350.800321 records/second
[2019-11-03 11:47:38.987] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 163, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 82 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 411, "sum": 411.0, "min": 411}, "Total Records Seen": {"count": 1, "max": 2055000, "sum": 2055000.0, "min": 2055000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 82, "sum": 82.0, "min": 82}}, "EndTime": 1572781658.988273, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 81}, "StartTime": 1572781658.879363}
[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=229286.649859 records/second
[2019-11-03 11:47:39.083] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 165, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 83 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 416, "sum": 416.0, "min": 416}, "Total Records Seen": {"count": 1, "max": 2080000, "sum": 2080000.0, "min": 2080000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 83, "sum": 83.0, "min": 83}}, "EndTime": 1572781659.084386, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 82}, "StartTime": 1572781658.988487}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=260354.065798 records/second
[2019-11-03 11:47:39.195] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 167, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 84 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 421, "sum": 421.0, "min": 421}, "Total Records Seen": {"count": 1, "max": 2105000, "sum": 2105000.0, "min": 2105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 84, "sum": 84.0, "min": 84}}, "EndTime": 1572781659.196119, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 83}, "StartTime": 1572781659.086359}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=227550.123586 records/second
[2019-11-03 11:47:39.306] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 169, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 85 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 426, "sum": 426.0, "min": 426}, "Total Records Seen": {"count": 1, "max": 2130000, "sum": 2130000.0, "min": 2130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 85, "sum": 85.0, "min": 85}}, "EndTime": 1572781659.306819, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 84}, "StartTime": 1572781659.196313}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=226019.330419 records/second
[2019-11-03 11:47:39.418] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 171, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 86 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 431, "sum": 431.0, "min": 431}, "Total Records Seen": {"count": 1, "max": 2155000, "sum": 2155000.0, "min": 2155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 86, "sum": 86.0, "min": 86}}, "EndTime": 1572781659.418981, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 85}, "StartTime": 1572781659.307034}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=223112.669583 records/second
[2019-11-03 11:47:39.535] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 173, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 87 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 436, "sum": 436.0, "min": 436}, "Total Records Seen": {"count": 1, "max": 2180000, "sum": 2180000.0, "min": 2180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 87, "sum": 87.0, "min": 87}}, "EndTime": 1572781659.535613, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 86}, "StartTime": 1572781659.421016}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=217954.308781 records/second
[2019-11-03 11:47:39.653] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 175, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 88 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 441, "sum": 441.0, "min": 441}, "Total Records Seen": {"count": 1, "max": 2205000, "sum": 2205000.0, "min": 2205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 88, "sum": 88.0, "min": 88}}, "EndTime": 1572781659.654219, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 87}, "StartTime": 1572781659.537496}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=213978.507789 records/second
[2019-11-03 11:47:39.242] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 153, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 77 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 386, "sum": 386.0, "min": 386}, "Total Records Seen": {"count": 1, "max": 1930000, "sum": 1930000.0, "min": 1930000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 77, "sum": 77.0, "min": 77}}, "EndTime": 1572781659.242942, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 76}, "StartTime": 1572781659.135533}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=232437.345108 records/second
[2019-11-03 11:47:39.362] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 155, "duration": 117, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 78 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 391, "sum": 391.0, "min": 391}, "Total Records Seen": {"count": 1, "max": 1955000, "sum": 1955000.0, "min": 1955000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 78, "sum": 78.0, "min": 78}}, "EndTime": 1572781659.36302, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 77}, "StartTime": 1572781659.244852}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=211332.314873 records/second
[2019-11-03 11:47:39.468] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 157, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 79 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 396, "sum": 396.0, "min": 396}, "Total Records Seen": {"count": 1, "max": 1980000, "sum": 1980000.0, "min": 1980000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 79, "sum": 79.0, "min": 79}}, "EndTime": 1572781659.468665, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 78}, "StartTime": 1572781659.363264}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=236905.829698 records/second
[2019-11-03 11:47:39.579] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 159, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 80 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 401, "sum": 401.0, "min": 401}, "Total Records Seen": {"count": 1, "max": 2005000, "sum": 2005000.0, "min": 2005000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 80, "sum": 80.0, "min": 80}}, "EndTime": 1572781659.579519, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 79}, "StartTime": 1572781659.468902}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=225727.398758 records/second
[2019-11-03 11:47:39.694] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 161, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 81 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 406, "sum": 406.0, "min": 406}, "Total Records Seen": {"count": 1, "max": 2030000, "sum": 2030000.0, "min": 2030000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 81, "sum": 81.0, "min": 81}}, "EndTime": 1572781659.695116, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 80}, "StartTime": 1572781659.579765}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216469.033856 records/second
[2019-11-03 11:47:39.808] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 163, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 82 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 411, "sum": 411.0, "min": 411}, "Total Records Seen": {"count": 1, "max": 2055000, "sum": 2055000.0, "min": 2055000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 82, "sum": 82.0, "min": 82}}, "EndTime": 1572781659.80906, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 81}, "StartTime": 1572781659.695364}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=219606.72616 records/second
[2019-11-03 11:47:39.924] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 165, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 83 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 416, "sum": 416.0, "min": 416}, "Total Records Seen": {"count": 1, "max": 2080000, "sum": 2080000.0, "min": 2080000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 83, "sum": 83.0, "min": 83}}, "EndTime": 1572781659.925035, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 82}, "StartTime": 1572781659.809357}
[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=215857.64515 records/second
[2019-11-03 11:47:40.033] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 167, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 84 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 421, "sum": 421.0, "min": 421}, "Total Records Seen": {"count": 1, "max": 2105000, "sum": 2105000.0, "min": 2105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 84, "sum": 84.0, "min": 84}}, "EndTime": 1572781660.033847, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 83}, "StartTime": 1572781659.925312}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=230064.900587 records/second
[2019-11-03 11:47:40.160] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 169, "duration": 124, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 85 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 426, "sum": 426.0, "min": 426}, "Total Records Seen": {"count": 1, "max": 2130000, "sum": 2130000.0, "min": 2130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 85, "sum": 85.0, "min": 85}}, "EndTime": 1572781660.160738, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 84}, "StartTime": 1572781660.034084}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=197187.113905 records/second
[2019-11-03 11:47:39.753] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 177, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 89 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 446, "sum": 446.0, "min": 446}, "Total Records Seen": {"count": 1, "max": 2230000, "sum": 2230000.0, "min": 2230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 89, "sum": 89.0, "min": 89}}, "EndTime": 1572781659.754287, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 88}, "StartTime": 1572781659.656077}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254284.695766 records/second
[2019-11-03 11:47:39.844] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 179, "duration": 90, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 90 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 451, "sum": 451.0, "min": 451}, "Total Records Seen": {"count": 1, "max": 2255000, "sum": 2255000.0, "min": 2255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 90, "sum": 90.0, "min": 90}}, "EndTime": 1572781659.845219, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 89}, "StartTime": 1572781659.75448}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=275145.303451 records/second
[2019-11-03 11:47:39.953] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 181, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 91 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 456, "sum": 456.0, "min": 456}, "Total Records Seen": {"count": 1, "max": 2280000, "sum": 2280000.0, "min": 2280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 91, "sum": 91.0, "min": 91}}, "EndTime": 1572781659.953676, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 90}, "StartTime": 1572781659.845461}
[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=230744.313781 records/second
[2019-11-03 11:47:40.060] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 183, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 92 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 461, "sum": 461.0, "min": 461}, "Total Records Seen": {"count": 1, "max": 2305000, "sum": 2305000.0, "min": 2305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 92, "sum": 92.0, "min": 92}}, "EndTime": 1572781660.060468, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 91}, "StartTime": 1572781659.955789}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=238536.083788 records/second
[2019-11-03 11:47:40.165] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 185, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 93 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 466, "sum": 466.0, "min": 466}, "Total Records Seen": {"count": 1, "max": 2330000, "sum": 2330000.0, "min": 2330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 93, "sum": 93.0, "min": 93}}, "EndTime": 1572781660.165692, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 92}, "StartTime": 1572781660.062342}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241597.353105 records/second
[2019-11-03 11:47:40.269] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 187, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 94 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 471, "sum": 471.0, "min": 471}, "Total Records Seen": {"count": 1, "max": 2355000, "sum": 2355000.0, "min": 2355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 94, "sum": 94.0, "min": 94}}, "EndTime": 1572781660.269801, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 93}, "StartTime": 1572781660.167789}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=244587.508631 records/second
[2019-11-03 11:47:40.373] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 189, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 95 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 476, "sum": 476.0, "min": 476}, "Total Records Seen": {"count": 1, "max": 2380000, "sum": 2380000.0, "min": 2380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 95, "sum": 95.0, "min": 95}}, "EndTime": 1572781660.374038, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 94}, "StartTime": 1572781660.271758}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=244131.376699 records/second
[2019-11-03 11:47:40.472] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 191, "duration": 98, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 96 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 481, "sum": 481.0, "min": 481}, "Total Records Seen": {"count": 1, "max": 2405000, "sum": 2405000.0, "min": 2405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 96, "sum": 96.0, "min": 96}}, "EndTime": 1572781660.473244, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 95}, "StartTime": 1572781660.374283}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=252290.783452 records/second
[2019-11-03 11:47:40.584] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 193, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 97 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 486, "sum": 486.0, "min": 486}, "Total Records Seen": {"count": 1, "max": 2430000, "sum": 2430000.0, "min": 2430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 97, "sum": 97.0, "min": 97}}, "EndTime": 1572781660.585374, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 96}, "StartTime": 1572781660.475085}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=226414.149157 records/second
[2019-11-03 11:47:40.268] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 171, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 86 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 431, "sum": 431.0, "min": 431}, "Total Records Seen": {"count": 1, "max": 2155000, "sum": 2155000.0, "min": 2155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 86, "sum": 86.0, "min": 86}}, "EndTime": 1572781660.269015, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 85}, "StartTime": 1572781660.160977}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=231130.351597 records/second
[2019-11-03 11:47:40.371] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 173, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 87 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 436, "sum": 436.0, "min": 436}, "Total Records Seen": {"count": 1, "max": 2180000, "sum": 2180000.0, "min": 2180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 87, "sum": 87.0, "min": 87}}, "EndTime": 1572781660.372365, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 86}, "StartTime": 1572781660.269253}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=242147.462543 records/second
[2019-11-03 11:47:40.486] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 175, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 88 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 441, "sum": 441.0, "min": 441}, "Total Records Seen": {"count": 1, "max": 2205000, "sum": 2205000.0, "min": 2205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 88, "sum": 88.0, "min": 88}}, "EndTime": 1572781660.487108, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 87}, "StartTime": 1572781660.372609}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=217679.211637 records/second
[2019-11-03 11:47:40.595] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 177, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 89 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 446, "sum": 446.0, "min": 446}, "Total Records Seen": {"count": 1, "max": 2230000, "sum": 2230000.0, "min": 2230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 89, "sum": 89.0, "min": 89}}, "EndTime": 1572781660.596009, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 88}, "StartTime": 1572781660.489044}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233424.603864 records/second
[2019-11-03 11:47:40.715] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 179, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 90 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 451, "sum": 451.0, "min": 451}, "Total Records Seen": {"count": 1, "max": 2255000, "sum": 2255000.0, "min": 2255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 90, "sum": 90.0, "min": 90}}, "EndTime": 1572781660.716018, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 89}, "StartTime": 1572781660.596258}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208517.47562 records/second
[2019-11-03 11:47:40.854] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 181, "duration": 138, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 91 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 456, "sum": 456.0, "min": 456}, "Total Records Seen": {"count": 1, "max": 2280000, "sum": 2280000.0, "min": 2280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 91, "sum": 91.0, "min": 91}}, "EndTime": 1572781660.855434, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 90}, "StartTime": 1572781660.716272}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=179375.916273 records/second
[2019-11-03 11:47:40.981] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 183, "duration": 124, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 92 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 461, "sum": 461.0, "min": 461}, "Total Records Seen": {"count": 1, "max": 2305000, "sum": 2305000.0, "min": 2305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 92, "sum": 92.0, "min": 92}}, "EndTime": 1572781660.981819, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 91}, "StartTime": 1572781660.855922}
[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=198300.994743 records/second
[2019-11-03 11:47:41.087] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 185, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 93 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 466, "sum": 466.0, "min": 466}, "Total Records Seen": {"count": 1, "max": 2330000, "sum": 2330000.0, "min": 2330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 93, "sum": 93.0, "min": 93}}, "EndTime": 1572781661.087967, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 92}, "StartTime": 1572781660.984266}
[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=240775.75379 records/second
[2019-11-03 11:47:41.189] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 187, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 94 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 471, "sum": 471.0, "min": 471}, "Total Records Seen": {"count": 1, "max": 2355000, "sum": 2355000.0, "min": 2355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 94, "sum": 94.0, "min": 94}}, "EndTime": 1572781661.189626, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 93}, "StartTime": 1572781661.089645}
[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=249742.415979 records/second
[2019-11-03 11:47:40.697] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 195, "duration": 110, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 98 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 491, "sum": 491.0, "min": 491}, "Total Records Seen": {"count": 1, "max": 2455000, "sum": 2455000.0, "min": 2455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 98, "sum": 98.0, "min": 98}}, "EndTime": 1572781660.698355, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 97}, "StartTime": 1572781660.587506}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=225269.616479 records/second
[2019-11-03 11:47:40.838] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 197, "duration": 137, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 99 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 496, "sum": 496.0, "min": 496}, "Total Records Seen": {"count": 1, "max": 2480000, "sum": 2480000.0, "min": 2480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 99, "sum": 99.0, "min": 99}}, "EndTime": 1572781660.838742, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 98}, "StartTime": 1572781660.700431}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=180615.201237 records/second
[2019-11-03 11:47:40.938] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 199, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 100 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 501, "sum": 501.0, "min": 501}, "Total Records Seen": {"count": 1, "max": 2505000, "sum": 2505000.0, "min": 2505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 100, "sum": 100.0, "min": 100}}, "EndTime": 1572781660.939309, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 99}, "StartTime": 1572781660.838951}
[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=248843.324315 records/second
[2019-11-03 11:47:41.293] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 189, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 95 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 476, "sum": 476.0, "min": 476}, "Total Records Seen": {"count": 1, "max": 2380000, "sum": 2380000.0, "min": 2380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 95, "sum": 95.0, "min": 95}}, "EndTime": 1572781661.294615, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 94}, "StartTime": 1572781661.189859}
[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=238360.94119 records/second
[2019-11-03 11:47:41.400] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 191, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 96 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 481, "sum": 481.0, "min": 481}, "Total Records Seen": {"count": 1, "max": 2405000, "sum": 2405000.0, "min": 2405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 96, "sum": 96.0, "min": 96}}, "EndTime": 1572781661.401357, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 95}, "StartTime": 1572781661.294858}
[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=234471.130948 records/second
[2019-11-03 11:47:41.505] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 193, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 97 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 486, "sum": 486.0, "min": 486}, "Total Records Seen": {"count": 1, "max": 2430000, "sum": 2430000.0, "min": 2430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 97, "sum": 97.0, "min": 97}}, "EndTime": 1572781661.506414, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 96}, "StartTime": 1572781661.401588}
[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=238201.746913 records/second
[2019-11-03 11:47:41.607] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 195, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 98 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 491, "sum": 491.0, "min": 491}, "Total Records Seen": {"count": 1, "max": 2455000, "sum": 2455000.0, "min": 2455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 98, "sum": 98.0, "min": 98}}, "EndTime": 1572781661.608669, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 97}, "StartTime": 1572781661.506654}
[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=244752.499282 records/second
[2019-11-03 11:47:41.708] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 197, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 99 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 496, "sum": 496.0, "min": 496}, "Total Records Seen": {"count": 1, "max": 2480000, "sum": 2480000.0, "min": 2480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 99, "sum": 99.0, "min": 99}}, "EndTime": 1572781661.70883, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 98}, "StartTime": 1572781661.608911}
[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=249809.6486 records/second
[2019-11-03 11:47:41.810] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 199, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 100 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 501, "sum": 501.0, "min": 501}, "Total Records Seen": {"count": 1, "max": 2505000, "sum": 2505000.0, "min": 2505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 100, "sum": 100.0, "min": 100}}, "EndTime": 1572781661.810584, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 99}, "StartTime": 1572781661.70911}
[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=246070.664214 records/second
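The `#throughput_metric` figures are just records processed divided by the epoch's wall-clock duration, which can be recovered from the `StartTime`/`EndTime` Unix timestamps in the preceding `#metrics` line. A minimal sketch using the epoch-99 values for algo-1 above (illustrative arithmetic, not SageMaker code):

```python
# Throughput = records processed / epoch duration. Values are copied from
# the epoch-99 #metrics entry for algo-1 above; StartTime/EndTime are
# Unix timestamps in seconds.
records = 25000
start, end = 1572781661.70911, 1572781661.810584
throughput = records / (end - start)
print(f"{throughput:.0f} records/second")  # close to the logged 246070
```

The small gap between this figure and the logged 246070 comes from the logger timing a slightly different window than the `#metrics` timestamps.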
[11/03/2019 11:47:41 INFO 140169171593024] shrinking 100 centers into 10
[11/03/2019 11:47:41 INFO 140169171593024] local kmeans attempt #0. Current mean square distance 12.902647
[11/03/2019 11:47:41 INFO 140169171593024] local kmeans attempt #1. Current mean square distance 11.803318
[11/03/2019 11:47:41 INFO 140169171593024] local kmeans attempt #2. Current mean square distance 12.321064
[11/03/2019 11:47:41 INFO 140169171593024] local kmeans attempt #3. Current mean square distance 12.036984
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #4. Current mean square distance 12.555333
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #5. Current mean square distance 12.615070
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #6. Current mean square distance 11.918087
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #7. Current mean square distance 12.279174
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #8. Current mean square distance 12.339795
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #9. Current mean square distance 12.555266
[11/03/2019 11:47:42 INFO 140169171593024] finished shrinking process. Mean Square Distance = 12
[11/03/2019 11:47:42 INFO 140169171593024] #quality_metric: host=algo-1, train msd <loss>=11.8033180237
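The "shrinking" step above reduces the `extra_center_factor` working set of 100 centers to the requested `k=10` by running local Lloyd's k-means several times (`local_lloyd_num_trials`, here 10 attempts) and keeping the attempt with the lowest mean square distance — attempt #1 wins with msd 11.8033. A best-of-N restart loop can be sketched in pure NumPy (an illustrative stand-in, not the SageMaker k-means implementation):

```python
# Best-of-N k-means restarts: run Lloyd's algorithm n_trials times and keep
# the attempt with the lowest mean squared distance, mirroring the
# "local kmeans attempt #..." lines above. NumPy sketch only.
import numpy as np

def lloyd(X, k, n_iter=50, seed=None):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random init
    for _ in range(n_iter):
        # squared distances from every point to every center
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    msd = ((X - centers[labels]) ** 2).sum(-1).mean()
    return centers, msd

def best_of_n(X, k, n_trials=10, seed=0):
    attempts = [lloyd(X, k, seed=seed + t) for t in range(n_trials)]
    return min(attempts, key=lambda cm: cm[1])  # lowest MSD wins

X = np.random.default_rng(0).normal(size=(500, 2))
centers, msd = best_of_n(X, k=10)
```

The default `local_lloyd_init_method` in the configuration at the top of the log is `kmeans++`, not the uniform-random init used in this sketch.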
[11/03/2019 11:47:42 INFO 140169171593024] batch data loading with context took: 38.6209%, (4.388304 secs)
[11/03/2019 11:47:42 INFO 140169171593024] compute all data-center distances: point norm took: 19.0106%, (2.160087 secs)
[11/03/2019 11:47:42 INFO 140169171593024] gradient: cluster center took: 13.1121%, (1.489863 secs)
[11/03/2019 11:47:42 INFO 140169171593024] compute all data-center distances: inner product took: 9.4443%, (1.073109 secs)
[11/03/2019 11:47:42 INFO 140169171593024] collect from kv store took: 5.5164%, (0.626799 secs)
[11/03/2019 11:47:42 INFO 140169171593024] predict compute msd took: 4.7494%, (0.539646 secs)
[11/03/2019 11:47:42 INFO 140169171593024] gradient: cluster size took: 3.1338%, (0.356081 secs)
[11/03/2019 11:47:42 INFO 140169171593024] splitting centers key-value pair took: 1.9277%, (0.219037 secs)
[11/03/2019 11:47:42 INFO 140169171593024] compute all data-center distances: center norm took: 1.5278%, (0.173592 secs)
[11/03/2019 11:47:42 INFO 140169171593024] gradient: one_hot took: 1.4084%, (0.160024 secs)
[11/03/2019 11:47:42 INFO 140169171593024] update state and report convergence took: 1.3147%, (0.149378 secs)
[11/03/2019 11:47:42 INFO 140169171593024] update set-up time took: 0.1200%, (0.013640 secs)
[11/03/2019 11:47:42 INFO 140169171593024] predict minus dist took: 0.1141%, (0.012959 secs)
[11/03/2019 11:47:42 INFO 140169171593024] TOTAL took: 11.3625204563
[11/03/2019 11:47:42 INFO 140169171593024] Number of GPUs being used: 0
#metrics {"Metrics": {"finalize.time": {"count": 1, "max": 387.3600959777832, "sum": 387.3600959777832, "min": 387.3600959777832}, "initialize.time": {"count": 1, "max": 42.871952056884766, "sum": 42.871952056884766, "min": 42.871952056884766}, "model.serialize.time": {"count": 1, "max": 0.2219676971435547, "sum": 0.2219676971435547, "min": 0.2219676971435547}, "update.time": {"count": 100, "max": 197.33190536499023, "sum": 11322.939395904541, "min": 97.9759693145752}, "epochs": {"count": 1, "max": 100, "sum": 100.0, "min": 100}, "state.serialize.time": {"count": 1, "max": 0.5171298980712891, "sum": 0.5171298980712891, "min": 0.5171298980712891}, "_shrink.time": {"count": 1, "max": 384.3569755554199, "sum": 384.3569755554199, "min": 384.3569755554199}}, "EndTime": 1572781662.199495, "Dimensions": {"Host": "algo-1", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781650.32371}
[11/03/2019 11:47:42 INFO 140169171593024] Test data is not provided.
#metrics {"Metrics": {"totaltime": {"count": 1, "max": 13017.530918121338, "sum": 13017.530918121338, "min": 13017.530918121338}, "setuptime": {"count": 1, "max": 30.853986740112305, "sum": 30.853986740112305, "min": 30.853986740112305}}, "EndTime": 1572781662.202104, "Dimensions": {"Host": "algo-1", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781662.199603}
[11/03/2019 11:47:41 INFO 140552810366784] shrinking 100 centers into 10
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #0. Current mean square distance 12.250052
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #1. Current mean square distance 12.186016
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #2. Current mean square distance 12.200719
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #3. Current mean square distance 11.887745
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #4. Current mean square distance 12.341534
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #5. Current mean square distance 12.504448
[11/03/2019 11:47:42 INFO 140552810366784] local kmeans attempt #6. Current mean square distance 12.133743
[11/03/2019 11:47:42 INFO 140552810366784] local kmeans attempt #7. Current mean square distance 12.772625
[11/03/2019 11:47:42 INFO 140552810366784] local kmeans attempt #8. Current mean square distance 12.143409
[11/03/2019 11:47:42 INFO 140552810366784] local kmeans attempt #9. Current mean square distance 12.344214
[11/03/2019 11:47:42 INFO 140552810366784] finished shrinking process. Mean Square Distance = 12
[11/03/2019 11:47:42 INFO 140552810366784] #quality_metric: host=algo-2, train msd <loss>=11.8877449036
[11/03/2019 11:47:42 INFO 140552810366784] batch data loading with context took: 31.9681%, (3.320623 secs)
[11/03/2019 11:47:42 INFO 140552810366784] compute all data-center distances: point norm took: 20.7105%, (2.151268 secs)
[11/03/2019 11:47:42 INFO 140552810366784] collect from kv store took: 13.6408%, (1.416910 secs)
[11/03/2019 11:47:42 INFO 140552810366784] gradient: cluster center took: 11.5084%, (1.195417 secs)
[11/03/2019 11:47:42 INFO 140552810366784] compute all data-center distances: inner product took: 9.2459%, (0.960398 secs)
[11/03/2019 11:47:42 INFO 140552810366784] predict compute msd took: 4.4798%, (0.465329 secs)
[11/03/2019 11:47:42 INFO 140552810366784] gradient: cluster size took: 3.0899%, (0.320962 secs)
[11/03/2019 11:47:42 INFO 140552810366784] gradient: one_hot took: 1.5796%, (0.164074 secs)
[11/03/2019 11:47:42 INFO 140552810366784] update state and report convergence took: 1.2818%, (0.133143 secs)
[11/03/2019 11:47:42 INFO 140552810366784] splitting centers key-value pair took: 1.1349%, (0.117886 secs)
[11/03/2019 11:47:42 INFO 140552810366784] compute all data-center distances: center norm took: 1.1272%, (0.117085 secs)
[11/03/2019 11:47:42 INFO 140552810366784] predict minus dist took: 0.1201%, (0.012476 secs)
[11/03/2019 11:47:42 INFO 140552810366784] update set-up time took: 0.1130%, (0.011741 secs)
[11/03/2019 11:47:42 INFO 140552810366784] TOTAL took: 10.3873124123
[11/03/2019 11:47:42 INFO 140552810366784] Number of GPUs being used: 0
[11/03/2019 11:47:42 INFO 140552810366784] No model is serialized on a non-master node
#metrics {"Metrics": {"finalize.time": {"count": 1, "max": 291.3999557495117, "sum": 291.3999557495117, "min": 291.3999557495117}, "initialize.time": {"count": 1, "max": 41.98312759399414, "sum": 41.98312759399414, "min": 41.98312759399414}, "model.serialize.time": {"count": 1, "max": 0.07700920104980469, "sum": 0.07700920104980469, "min": 0.07700920104980469}, "update.time": {"count": 100, "max": 179.54707145690918, "sum": 10432.80816078186, "min": 89.97201919555664}, "epochs": {"count": 1, "max": 100, "sum": 100.0, "min": 100}, "state.serialize.time": {"count": 1, "max": 0.4820823669433594, "sum": 0.4820823669433594, "min": 0.4820823669433594}, "_shrink.time": {"count": 1, "max": 288.4190082550049, "sum": 288.4190082550049, "min": 288.4190082550049}}, "EndTime": 1572781662.107717, "Dimensions": {"Host": "algo-2", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781650.328628}
[11/03/2019 11:47:42 INFO 140552810366784] Test data is not provided.
#metrics {"Metrics": {"totaltime": {"count": 1, "max": 13907.652139663696, "sum": 13907.652139663696, "min": 13907.652139663696}, "setuptime": {"count": 1, "max": 16.698122024536133, "sum": 16.698122024536133, "min": 16.698122024536133}}, "EndTime": 1572781662.109637, "Dimensions": {"Host": "algo-2", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781662.107824}
2019-11-03 11:47:54 Uploading - Uploading generated training model
2019-11-03 11:47:54 Completed - Training job completed
Training seconds: 142
Billable seconds: 142
CPU times: user 7.93 s, sys: 394 ms, total: 8.33 s
Wall time: 3min 21s
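The `#metrics` lines throughout this log are machine-readable JSON after the `#metrics ` prefix, so per-epoch counters and timings can be tabulated from the saved CloudWatch/notebook output. A minimal parsing sketch (the field names are taken from the entries above; nothing SageMaker-specific is assumed beyond the prefix):

```python
# Extract the JSON payload from "#metrics {...}" lines like those above,
# so per-epoch counters can be tabulated from the raw log text.
import json

def parse_metrics(log_text):
    records = []
    for line in log_text.splitlines():
        if line.startswith("#metrics "):
            records.append(json.loads(line[len("#metrics "):]))
    return records

sample = ('#metrics {"Metrics": {"Reset Count": {"count": 1, "max": 100, '
          '"sum": 100.0, "min": 100}}, "Dimensions": {"Host": "algo-1"}}')
recs = parse_metrics(sample)
print(recs[0]["Dimensions"]["Host"])  # -> algo-1
```

Filtering on `Dimensions["Host"]` separates the interleaved algo-1 and algo-2 streams, whose epochs arrive out of order in the combined log.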