TimescaleDB: 2.15.2
PostgreSQL: 14.5
select * from timescaledb_information.job_errors;
job_id | proc_schema | proc_name | pid | start_time | finish_time | sqlerrcode | err_message
--------+------------------------+------------------+---------+-------------------------------+-------------------------------+------------+-------------------------------------
1 | _timescaledb_functions | policy_telemetry | 1010670 | 2024-06-11 12:12:39.097899-04 | 2024-06-11 12:12:39.159925-04 | | job crash detected, see server logs
1 | _timescaledb_functions | policy_telemetry | 1044003 | 2024-06-11 13:12:39.100308-04 | 2024-06-11 13:12:42.168173-04 | | job crash detected, see server logs
1 | _timescaledb_functions | policy_telemetry | 1077300 | 2024-06-11 14:12:39.104991-04 | 2024-06-11 14:12:42.232221-04 | | job crash detected, see server logs
(3 rows)
select * from timescaledb_information.job_history;
id | job_id | succeeded | proc_schema | proc_name | pid | start_time | finish_time | config | sqlerrcode | err_message
----+--------+-----------+------------------------+------------------+---------+-------------------------------+-------------------------------+--------+------------+-------------------------------------
1 | 1 | f | _timescaledb_functions | policy_telemetry | 1010670 | 2024-06-11 12:12:39.097899-04 | 2024-06-11 12:12:39.159925-04 | | | job crash detected, see server logs
2 | 1 | f | _timescaledb_functions | policy_telemetry | 1044003 | 2024-06-11 13:12:39.100308-04 | 2024-06-11 13:12:42.168173-04 | | | job crash detected, see server logs
3 | 1 | f | _timescaledb_functions | policy_telemetry | 1077300 | 2024-06-11 14:12:39.104991-04 | 2024-06-11 14:12:42.232221-04 | | | job crash detected, see server logs
4 | 1 | f | _timescaledb_functions | policy_telemetry | 1113751 | 2024-06-11 15:12:39.105129-04 | 2024-06-11 15:12:39.114045-04 | | | job crash detected, see server logs
(4 rows)
Anything insightful in the server logs?
select * from timescaledb_information.job_stats;
hypertable_schema | hypertable_name | job_id | last_run_started_at | last_successful_finish | last_run_status | job_status | last_run_duration | next_start | total_runs | total_successes | total_failures
-----------------------+-----------------------------+--------+-------------------------------+-------------------------------+-----------------+------------+-------------------+-------------------------------+------------+-----------------+----------------
_timescaledb_internal | _materialized_hypertable_2 | 1000 | 2024-06-13 10:03:15.95675-04 | 2024-06-13 10:03:15.973862-04 | Success | Scheduled | 00:00:00.017112 | 2024-06-13 10:33:15.973862-04 | 66 | 66 | 0
_timescaledb_internal | _materialized_hypertable_3 | 1001 | 2024-06-12 12:20:02.797604-04 | 2024-06-12 12:21:39.515605-04 | Success | Scheduled | 00:01:36.718001 | 2024-06-13 12:21:39.515605-04 | 2 | 2 | 0
_timescaledb_internal | _materialized_hypertable_38 | 1002 | 2024-06-13 10:14:45.788037-04 | 2024-06-13 10:14:45.802624-04 | Success | Scheduled | 00:00:00.014587 | 2024-06-13 10:44:45.802624-04 | 86 | 86 | 0
_timescaledb_internal | _materialized_hypertable_39 | 1003 | 2024-06-12 15:37:58.197583-04 | 2024-06-12 15:37:58.221794-04 | Success | Scheduled | 00:00:00.024211 | 2024-06-13 15:37:58.221794-04 | 2 | 2 | 0
_timescaledb_internal | _materialized_hypertable_40 | 1004 | 2024-06-13 09:45:22.377364-04 | 2024-06-13 09:45:22.391768-04 | Success | Scheduled | 00:00:00.014404 | 2024-06-13 10:15:22.391768-04 | 68 | 68 | 0
_timescaledb_internal | _materialized_hypertable_41 | 1005 | 2024-06-12 15:38:23.25869-04 | 2024-06-12 15:38:23.286633-04 | Success | Scheduled | 00:00:00.027943 | 2024-06-13 15:38:23.286633-04 | 2 | 2 | 0
 | | 1 | 2024-06-11 15:12:39.105046-04 | -infinity | Failed | Scheduled | 00:00:00.008924 | 2024-06-11 16:12:39.105046-04 | 4 | 0 | 4
 | | 3 | 2024-06-11 12:12:39.098481-04 | 2024-06-11 12:12:39.108912-04 | Success | Scheduled | 00:00:00.010431 | 2024-06-30 19:00:00-04 | 1 | 1 | 0
(8 rows)
These jobs don’t appear in timescaledb_information.job_history.
Nothing in the server log indicates a problem.
Are you facing any OOM errors?
I’d recommend creating a minimal POC and filing an issue on the extension’s main GitHub repository.
Please also include details about the machine and OS settings, since we’ll need to reproduce this locally in order to fix it.
We don’t see OOM.
The machine has 8 cores and 64 GB of RAM. I’m not sure what a minimal POC should look like.
If it is the telemetry job that is crashing, it’s probably because your server can’t connect to the internet. You can also disable telemetry; see "Telemetry and version checking" in the Timescale documentation.
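A minimal sketch of turning telemetry off, assuming superuser access and that a server-wide change is acceptable. The setting name timescaledb.telemetry_level is the one described in the Timescale telemetry documentation; whether a configuration reload is enough for the already scheduled job to stop running is an assumption, so check the job state afterwards.

```sql
-- Disable TimescaleDB telemetry server-wide (requires superuser).
ALTER SYSTEM SET timescaledb.telemetry_level = 'off';

-- Reload the configuration so the new value is picked up.
SELECT pg_reload_conf();

-- Confirm the value as seen by the current session.
SHOW timescaledb.telemetry_level;
```

The same setting can instead be placed in postgresql.conf if you prefer to keep configuration in files.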
To debug a job, you can add it manually without scheduling it, then call run_job with its job_id.
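A rough sketch of that workflow, under a few assumptions: debug_probe is a hypothetical procedure invented here for illustration, the one-hour interval is arbitrary, and the id passed to run_job is a placeholder for whatever add_job returns on your system.

```sql
-- Hypothetical procedure, used only to have something to run.
-- Job procedures receive the job id and the job's JSONB config.
CREATE OR REPLACE PROCEDURE debug_probe(job_id int, config jsonb)
LANGUAGE plpgsql AS $$
BEGIN
  RAISE NOTICE 'job % running with config %', job_id, config;
END
$$;

-- Register the job but keep the scheduler from picking it up.
-- add_job returns the new job_id.
SELECT add_job('debug_probe', '1 hour',
               config => '{"note": "manual test"}',
               scheduled => false);

-- Run the job in the foreground of the current session.
-- 1006 is a placeholder: use the id returned by add_job above.
CALL run_job(1006);
```

Because run_job executes in your own session, any error or notice is reported directly to the client instead of only ending up in the background worker's log.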
That’s because, by default, only failed jobs are logged. To get all job runs logged, I need to set timescaledb.enable_job_execution_logging = 'on'.
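One way this could be applied, as a sketch that assumes superuser access and that a configuration reload is sufficient for the setting to take effect (if it is not, a restart may be needed). The column names in the query are taken from the job_history output shown earlier in the thread.

```sql
-- Log successful job runs as well as failures.
ALTER SYSTEM SET timescaledb.enable_job_execution_logging = 'on';
SELECT pg_reload_conf();

-- After the jobs have run again, successful executions should
-- appear here with succeeded = t.
SELECT job_id, succeeded, start_time, finish_time, err_message
FROM timescaledb_information.job_history
ORDER BY start_time DESC
LIMIT 20;
```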
I was thinking more of running it in your own session and making the session verbose:
SET client_min_messages to DEBUG;
CALL run_job(1003);