How to restart all job queues? Use telemetry, Power Platform, and a small AL development

Last two weeks I have been testing the concept that I had in my mind for a long time. How to use the telemetry to be able to restart job queues.

Finally, I found the company that needed it. To be honest, I was thinking to do the prototype for the small customer first but… it turned out that my idea needs to handle at the same time 4 environments, multiple companies, and almost 100 Job Queue Entries in total that are running almost every 10 minutes or less (when the company will be fully operated there will be 10 environments with multiple companies and over 300 Job Queue Entries to manage).

The best it’s working! And to achieve that I just needed three things:

  1. telemetry turned on all environments (that were already in place)
  2. one simple API was published (in my opinion should be also added to standard APIs – will try to do a pull request to Microsoft in my spare time)
  3. Power Automate flow  

Moreover, the customer also needed the summary of all the Job Queue Entries that failed and how many times per day (btw my job was not to focus on “why” but just create a restart and send a notification). For that one more flow was helpful – however, I will not show you this today.

So how to deal with it?

Common practice was in the past to create a job queue that will run and restart other job queues. Ok, but what if I would also send email notifications about it or something more crazy? Then more development is needed.  I also could look on AppSource for a solution – and wow I counted at least 12 such solutions. Some free some paid (one even 900$). (Side comment: do we need so many solutions for such small things? And then we have over 2000 apps on AppSource).

So why did I choose a different approach? For my purpose, I just did not feel that any of the extensions is flexible enough. At some moment anyway, I would need to do custom development – that I want to avoid.  The other reason was that I do not want to manage separately each environment and each company for the Job Queue Monitoring. I just need one single source for all environments and companies so that if someone will create a new company I have no additional work.

Step 1: Turn on the telemetry

The only thing that someone needs to do for a new environment is enable telemetry for the environment.

I am already using telemetry which gives a lot of information about Job Queues. This is why I decided to do a combination of telemetry and Power Automate.

Step 2: Create API for restarting job queues

There are two things that you cannot do with telemetry. The first one is (I think it is obvious) that you cannot restart Job Queue Entries. Second is that since telemetry cannot expose any data privacy information means that because Microsoft does not know what is in the error message the job queue errors are not shown.

Mostly because of point 1, the API is needed however, it can solve both problems. The code you can find is below.

page 85100 "MNB Job Queue Entries"
{
    PageType = API;
    Caption = 'jobQueueEntry';
    APIPublisher = 'bc4all';
    APIGroup = 'custom';
    APIVersion = 'v1.0';
    EntityName = 'jobQueueEntry';
    EntitySetName = 'jobQueueEntries';
    SourceTable = "Job Queue Entry";
    DelayedInsert = true;
    InsertAllowed = false;
    DeleteAllowed = false;


    layout
    {
        area(Content)
        {
            repeater(APIFields)
            {
                field(id; Rec.ID) { }
                field(objectType; Rec."Object Type to Run") { }
                field(objectId; Rec."Object ID to Run") { }
                field(description; Rec.Description) { }
                field(status; Rec.Status) { }
                field(errorMessage; Rec."Error Message") { }
                field(systemId; Rec."SystemId") { }
            }
        }
    }

    [ServiceEnabled]
    procedure RestartJobQueue(var actionContext: WebServiceActionContext)
    var
        JobQueueEntry: Record "Job Queue Entry";
    begin
        JobQueueEntry.Get(Rec.ID);
        if JobQueueEntry.Status <> JobQueueEntry.Status::Error then
            exit;
        JobQueueEntry.Restart();
        actionContext.AddEntityKey(JobQueueEntry.FieldNo(Status), JobQueueEntry.Status);
        actionContext.SetResultCode(WebServiceActionResultCode::Updated);
    end;
}

Step 3: Create a Power Automate flow

I would love to share the flow with you however that seems not possible – this is because when trying to import it I found some issues. The flow, instead of the constant values, in some cases, use variables and connector for Business Central does not like it during import. But let me guide you on how to build it by yourself.

(I hope in the future I would be able to publish the solution as a template on Power Automate)

The Flow

First of all, you need to specify what is the recurrence when the flow will be run. That of course depends on the process of job queues if you run them every 5 minutes then then set recurrence to run every 4/5 minutes.

Next, you need to create 2 technical variables. Do not set them in the beginning. You will get values from the telemetry data and assign them to the variables at the proper time.

In the next step, You can run the telemetry query. It gets all failed Job Queue Entries that run in the given time. But here you needed to exclude a few special cases.

The first exclusion is that when the Job Queue Entry is run in the foreground in Business Central it is created as a copy of the Job Queue Entry. Such entries are logged in telemetry with empty Task Id (00000000-0000-0000-0000-000000000000). And are deleted automatically so they cannot be restarted.

The second kind of job queue task that should be excluded are webhooks. It turned out during my testing that the codeunit API Webhook Notification Send is created in the background and even if fails it is removed from the Job Queue Entries automatically. It means that the API cannot get the record. This is why I filtered the ObjectId 6154 in the below query.

I think there could be more excludes however for now did not find any other (I guess sending emails might be one of them).

Below you can find the query that I use to get the failed job queues. I set the time range in query with a variable but you can set it without it.

traces
| where timestamp > ago(@{variables('TimeSpanAgo')})
| where customDimensions has 'AL0000HE7'
| where customDimensions.eventId == 'AL0000HE7'
| where customDimensions.alJobQueueScheduledTaskId != '00000000-0000-0000-0000-000000000000'
| where customDimensions.alJobQueueObjectId  != '6154'
| where toint(customDimensions.alJobQueueNumberOfAttemptsToRun) == toint(customDimensions.alJobQueueMaxNumberOfAttemptsToRun)
| project timestamp
, aadTenantId = customDimensions.aadTenantId
, environmentName = customDimensions.environmentName
, environmentType = customDimensions.environmentType
, companyName = customDimensions.companyName
, alJobQueueId = customDimensions.alJobQueueId 	
, alJobQueueObjectId = customDimensions.alJobQueueObjectId 	
, alJobQueueObjectName = customDimensions.alJobQueueObjectName 
, alJobQueueObjectType = customDimensions.alJobQueueObjectType
, alJobQueueObjectDescription = customDimensions.alJobQueueObjectDescription 
, alJobQueueStatus = customDimensions.alJobQueueStatus
, alJobQueueExecutionTimeInMs = customDimensions.alJobQueueExecutionTimeInMs
, alJobQueueStacktrace = customDimensions.alJobQueueStacktrace 
, alJobQueueNumberOfAttemptsToRun = customDimensions.alJobQueueNumberOfAttemptsToRun 
, alJobQueueMaxNumberOfAttemptsToRun = customDimensions.alJobQueueMaxNumberOfAttemptsToRun 
, taskId = customDimensions.alJobQueueScheduledTaskId 

The next step is to check if there are any records in the result if not then cancel the flow.

The loop that contains all failed Job Queue Entries which have been found. The first thing that needs to be set from the telemetry signal is the Environment Name.

Later, you need to get the Company Id. This is needed because in telemetry only Company Name is sent. However, when connecting using the Business Central connector from Power Automate it is needed to provide the company id instead (if the custom value is used). Therefore you need to find the company id in the environment where the Job Queue failed.

When you already have the Company Id you can assign it to the variable that has been created in the beginning and run two Business Central actions –get and then restart the Job Queue Entry.

Little more magic when using Power Automate

Between or after those two last steps (GetJobQUeueEntryRecord and RestartJobQueueEntry) you can add anything you like: sending email, sending notifications on Teams, and more.

You even can very quickly add the steps:

  • Get the URL for Job Queue Entry Card
  • Send approvals to Teams

Below you can find the example. First, approval is required before restarting a job queue.

And then in Microsoft Teams, you would be able to see the approval request similar to below.

With all details about the failed Job Queue Entry and the link directly to Business Central. After approval, the Job Queue Entry will be restarted.

One Comment

Add a Comment

Your email address will not be published.