Rapid reboot of Arduino Opta when setting/reading device properties with watchdog enabled

Hello everyone,

I’m encountering an issue where the Arduino Opta suddenly reboots under certain conditions (the reboots do not appear to be watchdog reboots). Specifically, the reboots occur when:
• A relay is turned on/off via the API or a dashboard widget (it usually has to be toggled 1-3 times, sometimes more).
• All of the following code elements are present:

  1. Setting a property inside the input resource callback.
  2. Reading and/or setting a property within the loop() function.
  3. The watchdog timer is enabled.

If any of these elements is omitted, the reboots do not occur.

Here’s the code to reproduce the issue:

#define THINGER_SERIAL_DEBUG
#define THINGER_SERVER "acme.aws.thinger.io"

#include <ThingerMbedEth.h>
#include <ThingerPortentaOTA.h>
#include "arduino_secrets.h"

// Define the watchdog timeout
#define WATCHDOG_TIME_MS 10000

// Get the instance of the Watchdog
mbed::Watchdog& watchdog = mbed::Watchdog::get_instance(); // Required element 3

ThingerMbedEth thing(USERNAME, DEVICE_ID, DEVICE_CREDENTIAL);
ThingerPortentaOTA ota(thing);

int propertyValue2 = 1;
bool isThingerAuthenticated = false;  // Status if connected to Thinger.io

void setup() {
  Serial.begin(115200);

  // configure leds for output
  pinMode(LED_D0, OUTPUT);
  pinMode(LED_D1, OUTPUT);
  pinMode(LED_D2, OUTPUT);
  pinMode(LED_D3, OUTPUT);
  pinMode(LEDR, OUTPUT);
  pinMode(LED_BUILTIN, OUTPUT);

  // configure relays for output
  pinMode(D0, OUTPUT);
  pinMode(D1, OUTPUT);
  pinMode(D2, OUTPUT);
  pinMode(D3, OUTPUT);

  // Set state listener to update authentication status
  thing.set_state_listener([](ThingerClient::THINGER_STATE state) {
    isThingerAuthenticated = (state == ThingerClient::THINGER_AUTHENTICATED);
  });

  thing["relay_1"] << [](pson& in) {
    if (in.is_empty()) {
      in = (bool)digitalRead(D0);
    } else {
      digitalWrite(D0, in ? HIGH : LOW);
      digitalWrite(LED_D0, in ? HIGH : LOW);

      // Required element 1 start
      pson set_prop;
      set_prop["variable"] = false;
      thing.set_property("varStatus", set_prop, true);
      // Required element 1 end
    }
  };

  // start thinger on its own task
  thing.start();

  // Start the watchdog timer
  watchdog.start(WATCHDOG_TIME_MS); // Required element 3
}

void loop() {
  watchdog.kick(); // Required element 3
  if (isThingerAuthenticated) {
    // Required element 2 start
    pson send_data;
    send_data["propertyValue1"] = 1;
    thing.set_property("property_data", send_data, true);
    pson read_data;
    thing.get_property("property_data2", read_data);
    propertyValue2 = read_data["propertyValue2"];
    // Required element 2 end
  }
  delay(2000);
}

Has anyone experienced a similar problem or can offer insights into why this might be happening? Any assistance would be greatly appreciated.

Thank you!

Upon further investigation, it appears that the reboots are being triggered by the watchdog timer. Frequently pressing the relay control button (via the dashboard or API) causes significant delays in the loop, which eventually leads to a watchdog timeout.
I don’t know how communication across different threads is being handled in this context.

Hi,

Do you mean pressing it many times in a short period of time? I ask because this is weird: the communication should be fast enough to avoid triggering the watchdog. However, I don’t know how long this watchdog timeout is.

Have you tested with the WDT disabled, in order to confirm that it is the cause of the reboots?

Hope this helps.

Hi,
Yes, I’m referring to rapidly pressing the button multiple times within a short period. I measured the loop time, and without pressing the button, it averages around 90 ms. However, when the button is pressed, the loop time easily extends to 10 seconds or more, which eventually triggers the watchdog. It seems that the set_property and get_property functions, when combined with the RTOS, are causing the blocking behavior that leads to the watchdog timeout. To mitigate this, I’ve increased the watchdog timeout to 30 seconds.

Just to be sure: Did you check your power supply?
A weak power supply may lead to brownouts when relays are activated.

On my side I also get some reboots when my telecom provider (aaaaargh!) once again delivers a crappy Internet connection. I have defined a grace pause to help deal with this annoyance:

#if defined(GRACE_PAUSE)  // Prevent locking if Thinger does not answer
  if (GracePause > 0) GracePause--;
  thingHandleTime = millis();
  if (GracePause == 0) thing.handle();                    // do not call permanently Thinger if it takes too long to respond.
  if (millis() - thingHandleTime > 500) GracePause = 16;  // if the Thinger call took longer than 500ms, make 2s pause before retrying
#else
  thing.handle();
#endif
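
The snippet above depends on variables declared elsewhere in the sketch (GracePause, thingHandleTime). For readers who want to try the idea, here is a minimal self-contained sketch in plain C++. GracePauseGate is a hypothetical helper, not part of the Thinger library, and the threshold and pass count are simply the values from the snippet:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: after one call to a slow operation took too long,
// skip that operation for a number of loop passes before retrying.
class GracePauseGate {
public:
  GracePauseGate(uint32_t slowThresholdMs, uint16_t pausePasses)
      : slowThresholdMs_(slowThresholdMs), pausePasses_(pausePasses) {
    assert(pausePasses > 0);
  }

  // Call once per loop pass. Returns true if the guarded call may run now.
  bool shouldCall() {
    if (remaining_ > 0) remaining_--;
    return remaining_ == 0;
  }

  // Report how long the guarded call took; arms the pause if it was slow.
  void report(uint32_t durationMs) {
    if (durationMs > slowThresholdMs_) remaining_ = pausePasses_;
  }

private:
  uint32_t slowThresholdMs_;
  uint16_t pausePasses_;
  uint16_t remaining_ = 0;
};

// Intended use in an Arduino loop() (not compiled here):
//   static GracePauseGate gate(500, 16);
//   if (gate.shouldCall()) {
//     uint32_t t0 = millis();
//     thing.handle();
//     gate.report(millis() - t0);
//   }
```

How long the pause lasts in wall-clock time depends on how often loop() runs, so the pass count would need tuning per sketch.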

Thank you for your suggestion! I can confirm that the power supply is not the issue, as I’m using a 3.5A source, which should be sufficient. The measurements I took show that the loop time is around 90 ms without any action, but increases significantly when switching the relay. I will try switching to a fiber router instead of the GSM router and repeat the tests to see if the results (loop times) are similar. Thanks again for your input!

" but increases significantly when switching the relay "
Can you post the code that does that?
A loop time of 90 sec is quite long.
I really do a LOT with my weak ESP8266 and my loop times are just a few ms:

GracePause: 000
Job Durations(mS) Current - Max
Sche:000 - 001 
Fast:001 - 003
Slow:007 - 008
Stat:000 - 000
Disp:002 - 004
Seri:003 - 002
Wifi:001 - 002

Without relay switching, the loop time is ~90 ms (milliseconds). With switching, it rises to the order of seconds.

Below is the full code. Please note that inside "relay_1" I call set_property, which is one of the elements required to reproduce this behavior.

#define THINGER_SERIAL_DEBUG
#define THINGER_SERVER "acme.aws.thinger.io"

#include <ThingerMbedEth.h>
#include <ThingerPortentaOTA.h>
#include "arduino_secrets.h"

// Define the watchdog timeout
#define WATCHDOG_TIME_MS 10000

// Get the instance of the Watchdog
mbed::Watchdog& watchdog = mbed::Watchdog::get_instance(); // Required element 3

ThingerMbedEth thing(USERNAME, DEVICE_ID, DEVICE_CREDENTIAL);
ThingerPortentaOTA ota(thing);

int propertyValue2 = 1;
bool isThingerAuthenticated = false;  // Status if connected to Thinger.io

void setup() {
  Serial.begin(115200);

  // configure leds for output
  pinMode(LED_D0, OUTPUT);
  pinMode(LED_D1, OUTPUT);
  pinMode(LED_D2, OUTPUT);
  pinMode(LED_D3, OUTPUT);
  pinMode(LEDR, OUTPUT);
  pinMode(LED_BUILTIN, OUTPUT);
  pinMode(LED_USER, OUTPUT);

  // configure relays for output
  pinMode(D0, OUTPUT);
  pinMode(D1, OUTPUT);
  pinMode(D2, OUTPUT);
  pinMode(D3, OUTPUT);

  // Set state listener to update authentication status
  thing.set_state_listener([](ThingerClient::THINGER_STATE state) {
    isThingerAuthenticated = (state == ThingerClient::THINGER_AUTHENTICATED);
  });

  thing["relay_1"] << [](pson& in) {
    if (in.is_empty()) {
      in = (bool)digitalRead(D0);
    } else {
      digitalWrite(D0, in ? HIGH : LOW);
      digitalWrite(LED_D0, in ? HIGH : LOW);

      // Required element 1 start
      pson set_prop;
      set_prop["propertyValue3"] = false;
      thing.set_property("property_data3", set_prop, true);
      // Required element 1 end
    }
  };

  // start thinger on its own task
  thing.start();

  // Start the watchdog timer
  watchdog.start(WATCHDOG_TIME_MS); // Required element 3
}

void loop() {
  unsigned long loopStartTime = millis(); 

  watchdog.kick(); // Required element 3
  
  if (isThingerAuthenticated) {
    digitalWrite(LED_USER, HIGH);

    // Required element 2 start
    pson send_data;
    send_data["propertyValue1"] = 1;
    thing.set_property("property_data1", send_data, true);
    
    pson read_data;
    thing.get_property("property_data2", read_data);
    propertyValue2 = read_data["propertyValue2"];
    // Required element 2 end
  } else {
    digitalWrite(LED_USER, LOW);
  }

  unsigned long loopEndTime = millis(); 
  unsigned long loopDuration = loopEndTime - loopStartTime; 

  Serial.print("Loop duration: ");
  Serial.print(loopDuration);
  Serial.println(" ms");

  delay(2000);
}

Why do you use a delay?

Do you have the Thinger process running on another core?

The whole code seems very short, 90ms looks like a very long time for that.

The delay is there for no particular reason. Yes, the Thinger process is running on another core.

I use this to turn on/off relay:

I tested the GSM router and Fiber router:

Results:

GSM router; Loop without turning on/off relay:

[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 123 ms
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 100 ms
[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 105 ms

GSM router; Loop with turning on/off relay repeatedly:

[THINGER] Available bytes: 20
[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 19
[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 7 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 19
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 7 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[_SOCKET] Cannot read from socket!
Loop duration: 10158 ms
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 230 ms

Fiber router; Loop without turning on/off relay:

[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 28 ms
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 31 ms
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 37 ms
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 35 ms
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 29 ms
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 32 ms

Fiber router; Loop with turning on/off relay repeatedly:

[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Available bytes: 19
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 7 [OK]
[THINGER] Available bytes: 20
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 8 [OK]
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 8 [OK]
[_SOCKET] Cannot read from socket!
Loop duration: 10063 ms

Only now did I notice that there is an issue with the socket.

Also, the type of router makes a difference.
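
As a rough check of the numbers in these logs: loop() kicks the watchdog once per pass and ends with delay(2000), so a pass that stalls for 10158 ms leaves about 12158 ms between two kicks, which is more than the 10000 ms WATCHDOG_TIME_MS in the posted sketch. A minimal sketch of that arithmetic in plain C++ follows. The function names are hypothetical, and it assumes the watchdog fires exactly at its nominal timeout, which real hardware may not do:

```cpp
#include <cstdint>

// Worst-case time between two watchdog kicks when the kick happens once per
// loop pass: the time the pass can spend blocked, plus the trailing delay.
constexpr uint32_t worstCaseKickIntervalMs(uint32_t blockedMs, uint32_t delayMs) {
  return blockedMs + delayMs;
}

// True if the watchdog would expire before the next kick arrives.
constexpr bool watchdogExpires(uint32_t kickIntervalMs, uint32_t timeoutMs) {
  return kickIntervalMs >= timeoutMs;
}

// Values taken from the logs and the posted sketch.
static_assert(worstCaseKickIntervalMs(10158, 2000) == 12158, "stalled pass");
static_assert(watchdogExpires(12158, 10000), "10 s timeout is exceeded");
static_assert(!watchdogExpires(12158, 30000), "30 s timeout is not exceeded");
```

Under the same assumption, a normal pass (about 123 ms plus the 2000 ms delay) stays far below either timeout, which matches the observation that reboots only appear when the relay is toggled.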

Blame your Internet provider! I have the same problem here.
Many HTTP exchanges are suffering from awful lags. On a website that just leads to a brief stall in updates, but for IoT, which expects deterministic response times, it means “reboot”.

I did an additional test:
I adapted the code above to communicate over WiFi on the Opta (device 1), but instead of a relay I just used the built-in LED, and I removed the delay at the end. I then also ran this code on an Arduino WiFi Rev.2 (device 2) to compare the results. Without switching the relay (LED) on/off, the loop duration on the Opta is ~40 ms:

[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 38 ms
[THINGER] Writing bytes: 43 [OK]
[THINGER] Writing bytes: 21 [OK]
Loop duration: 40 ms

while on WiFi Rev.2 it is ~10000 ms:

[THINGER] Writing bytes: 21 [OK]
[_SOCKET] Cannot read from socket!
Loop duration: 10031 ms
[THINGER] Writing bytes: 44 [OK]
[THINGER] Writing bytes: 21 [OK]
[_SOCKET] Cannot read from socket!
Loop duration: 10031 ms

What exactly does “[_SOCKET] Cannot read from socket!” mean? If it is related only to the Internet provider, then why do I not see it on the Opta during an uninterrupted loop (without triggering the LED/relay), while on the WiFi Rev.2 it occurs on every loop?

Here is the code for WiFi Rev.2:

#define THINGER_SERIAL_DEBUG
#define THINGER_USE_STATIC_MEMORY
#define THINGER_STATIC_MEMORY_SIZE 512
#define _DISABLE_TLS_

// define private server instance
#define THINGER_SERVER "acme.aws.thinger.io"

#include <WiFi.h>
#include <ThingerWifi.h>
#include "arduino_secrets.h"

ThingerWifi thing(USERNAME, DEVICE_ID, DEVICE_CREDENTIAL);

int propertyValue2 = 1;
bool isThingerAuthenticated = false;  // Status if connected to Thinger.io

void setup() {
  // open serial for debugging
  Serial.begin(115200);

  // Set state listener to update authentication status
  thing.set_state_listener([](ThingerClient::THINGER_STATE state) {
    isThingerAuthenticated = (state == ThingerClient::THINGER_AUTHENTICATED);
  });

  pinMode(LED_BUILTIN, OUTPUT);

  thing["relay_1"] << [](pson& in) {
    if (in.is_empty()) {
      in = (bool)digitalRead(LED_BUILTIN);
    } else {
      digitalWrite(LED_BUILTIN, in ? HIGH : LOW);

      // Required element 1 start
      pson set_prop;
      set_prop["propertyValue3"] = false;
      thing.set_property("property_data3", set_prop, true);
      // Required element 1 end
    }
  };

  // Configure WiFi network
  thing.add_wifi(SSID, SSID_PASSWORD);
}

void loop() {
  thing.handle();

  unsigned long loopStartTime = millis(); 

  if (isThingerAuthenticated) {
    // Required element 2 start
    pson send_data;
    send_data["propertyValue1"] = 1;
    thing.set_property("property_data1", send_data, true);

    pson read_data;
    thing.get_property("property_data2", read_data);
    propertyValue2 = read_data["propertyValue2"];
    // Required element 2 end
  }

  unsigned long loopEndTime = millis();
  unsigned long loopDuration = loopEndTime - loopStartTime;

  Serial.print("Loop duration: ");
  Serial.print(loopDuration);
  Serial.println(" ms");
}

Hi,

I would recommend keeping the Thinger instructions on one core and the control process on the other, to keep things orderly; it is difficult to follow what is going on if both cores are sending instructions.

I would do this with bool variables. For example, if the control core needs to update a property, I would declare a global variable:

bool isUpdatingProperty = 0;

set it when an update is needed:

isUpdatingProperty = 1;

and check it on the Thinger core:

if (isUpdatingProperty)
{
  // update property instruction and other actions if needed...
  isUpdatingProperty = 0;
}
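
A minimal sketch of this flag handoff in standard C++, for anyone who wants to try it. It uses std::atomic, which is a swapped-in choice and not something from the post above: a plain bool shared between two threads is not guaranteed to be seen consistently. It assumes the toolchain provides <atomic> (not verified here for the Opta core, and typically not available on AVR-based boards). The function names are hypothetical:

```cpp
#include <atomic>

// Set by the control code, consumed by the code that talks to Thinger.io.
std::atomic<bool> isUpdatingProperty{false};

// Control side: request one property update.
void requestPropertyUpdate() {
  isUpdatingProperty.store(true);
}

// Thinger side: returns true once per pending request and clears the flag
// in the same atomic step, so a request cannot be handled twice.
bool consumePropertyUpdateRequest() {
  return isUpdatingProperty.exchange(false);
}
```

On the Thinger side, the property update would run only when consumePropertyUpdateRequest() returns true. Several requests made before the flag is consumed collapse into a single update.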

This guarantees that the property is set just once. I mention this because the line

[quote=“aeromek, post:8, topic:5224”]
if (isThingerAuthenticated)
[/quote]

allows “property_data1” to be set and “property_data2” to be read as fast as the microcontroller can run. I would recommend controlling this update and read frequency; having devices write to and read from a cloud instance without any rate control does not scale.
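
One way to control the frequency is an interval gate around the property calls. A minimal sketch in plain C++ follows. IntervalGate is a hypothetical helper, not part of the Thinger library, and on a board the nowMs argument would come from millis():

```cpp
#include <cstdint>

// Hypothetical helper: lets an action run at most once per interval.
// The current time is passed in, so the class has no hardware dependency.
class IntervalGate {
public:
  explicit IntervalGate(uint32_t intervalMs) : intervalMs_(intervalMs) {}

  // Returns true on the first call, and afterwards only when at least
  // intervalMs has passed since the last call that returned true.
  // Unsigned subtraction keeps the comparison correct across rollover.
  bool ready(uint32_t nowMs) {
    if (started_ && (nowMs - lastMs_) < intervalMs_) return false;
    lastMs_ = nowMs;
    started_ = true;
    return true;
  }

private:
  uint32_t intervalMs_;
  uint32_t lastMs_ = 0;
  bool started_ = false;
};
```

In loop(), the set_property/get_property pair would then sit inside if (gate.ready(millis())) { ... }, with an interval chosen for the application (for example 5000 ms, an arbitrary value).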

Hope this helps.


Thank you so much @ega! I’ve modified it according to your suggestions, and now everything is working as it should.

Edit:
I’m not getting consistent results at the moment… I’ll run some additional tests and follow up with the findings.


Hi,

That is good news! Regardless of the connection? I’m curious: how long does the loop take after the modifications?

BR