Today, while flashing the firmware to an ESP32 (using Arduino-Library 2.31 and PlatformIO espressif32@6.8.1; I also tested with espressif32@6.9.0 and espressif32@6.10.0) and trying to connect to my Thinger server, I got this SSL error message:
21:35:45.103 > [_SOCKET] Connecting to XXXXXXXXXXXX.aws.thinger.io:25202...
21:35:45.110 > [_SOCKET] Using secure TLS/SSL connection: yes
21:35:45.114 > Thinger State Listener: SOCKET_CONNECTING
21:35:46.691 > [_SOCKET] Connected!
21:35:46.692 > Thinger State Listener: SOCKET_CONNECTED
21:35:46.697 > [THINGER] Authenticating. User: iot_public Device: XXXXXXXXXXXX
21:35:46.703 > Thinger State Listener: THINGER_AUTHENTICATING
21:35:46.703 > [THINGER] Writing bytes: 56 [OK]
21:35:46.893 > [ 5901][E][ssl_client.cpp:37] _handle_error(): [data_to_read():361]: (-29184) SSL - An invalid SSL record was received
21:35:46.893 > [ 5901][E][ssl_client.cpp:37] _handle_error(): [data_to_read():361]: (-29184) SSL - An invalid SSL record was received
21:35:56.706 > [_SOCKET] Cannot read from socket!
21:35:56.706 > [THINGER] Auth Failed! Check username, device id, or device credentials.
21:35:56.723 > [_SOCKET] Is now closed!
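The log prints the error as a decimal number (-29184), while the mbedTLS headers list their error codes in hexadecimal. A small helper like the one below can make the lookup easier. It is only a sketch: the function name `format_tls_error_code` is made up for illustration and is not part of mbedTLS or the Thinger library.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: rewrite a decimal error code in the signed hex
// notation used by the mbedTLS headers (for example -0x7200).
std::string format_tls_error_code(int code) {
    char buf[16];
    const int magnitude = code < 0 ? -code : code;
    std::snprintf(buf, sizeof(buf), "%s0x%04X", code < 0 ? "-" : "", magnitude);
    return std::string(buf);
}
```

With this, -29184 becomes -0x7200, which should correspond to MBEDTLS_ERR_SSL_INVALID_RECORD, the code behind the "An invalid SSL record was received" message.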
The ESP32 keeps trying to connect, and the Inspector shows many messages indicating that the device connected and then disconnected, until the device finally manages to connect and the connection stabilizes.
Apparently, everything works fine with the single-core ESP32 (without FreeRTOS).
So far, we have only seen the SSL error on the dual-core ESP32 (with FreeRTOS). The ESP32 with FreeRTOS takes about 5 to 10 minutes and several attempts to stabilize the connection.
The TLS spec says that messages can be up to 16KB, but this requires 32KB of RAM on the ESP8266 (16KB RX buffer + 16KB TX buffer), so by default it’s turned down to 4KB, which is fine as long as the server doesn’t try to send a longer message. Streaming static files is a good way for this to happen, though! (In practice, most big servers use a TLS gateway that encrypts fragments of data from the HTTP server, so even when reading static files you don’t often see full 16KB messages, but it’s always a possibility.)
Two ways to fix this:
1. Enable the RFC 6066 max_fragment_length extension in the TLS config of the server. This is the best option, because it means client and server will negotiate a maximum 4KB message size. Unfortunately, most major third-party HTTPS servers don’t seem to support this extension.
2. Increase the max message size in the mbedTLS config (at the cost of free RAM). There is a patch for mbedTLS that I wrote last year that allows the TX and RX buffer sizes to be different, which saves some of the RAM overhead by allowing you to grow the RX buffer only. I should get around to looking at that again!
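To make the RAM trade-off concrete, here is a rough sketch of the arithmetic. The 16KB and 4KB figures come from the text above; the asymmetric case (16KB RX with 4KB TX) is an assumed example of what different buffer sizes could look like, and per-record overhead is ignored. The names are illustrative only.

```cpp
#include <cstddef>

// Largest record payload the TLS spec allows (16KB).
constexpr std::size_t kTlsMaxFragment = 16 * 1024;
// Reduced default size mentioned above (4KB).
constexpr std::size_t kReducedFragment = 4 * 1024;

// RAM taken by one connection's record buffers: simply RX plus TX.
// Per-record overhead (headers, MAC, padding) is deliberately left out.
constexpr std::size_t tls_buffer_ram(std::size_t rx_bytes, std::size_t tx_bytes) {
    return rx_bytes + tx_bytes;
}
```

Full-size buffers in both directions cost 32768 bytes, the reduced default costs 8192 bytes, and growing only the RX buffer (the assumed 16KB + 4KB split) would cost 20480 bytes.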
I finally had some time to put to this! … and it took … some time …
So, I needed to use the feature that adds a timeout for making the initial connection with AWS. Once this was done, I was able to get past the -2 BUGNUM error and on to the actual bug. The bug was that the perform_ssl_handshake function was making a solid connection and then relying on the return value of get_record_expansion to pass on the good connection. If record_expansion was not available or returned a negative value, this told the client that the connection had failed when it hadn’t. I am still not sure why get_record_expansion can do this, but it is only something we need to check if we need to know how many extra bytes the protocol will be adding (I think most of the time this is small enough not to be a worry).
If you would like to test this against your hardware, add this line to your platformio.ini instead of the usual include for SSLClient and you will be pulling in the branch with this fix.
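The idea behind the fix can be sketched as follows. This is not the actual SSLClient code: `HandshakeOutcome` and `evaluate_handshake` are hypothetical names, and the two integer arguments stand in for the return values of the handshake itself and of the record expansion query.

```cpp
// Hypothetical result type, for illustration only.
struct HandshakeOutcome {
    bool connected;        // true when the handshake itself succeeded
    int record_expansion;  // extra bytes per record, or -1 when unknown
};

// Decide success from the handshake result alone. A negative record
// expansion value is recorded as "unknown" instead of failing the connection.
HandshakeOutcome evaluate_handshake(int handshake_ret, int expansion_ret) {
    HandshakeOutcome outcome{};
    outcome.connected = (handshake_ret == 0);
    outcome.record_expansion = (expansion_ret >= 0) ? expansion_ret : -1;
    return outcome;
}
```

The design choice here is simply to keep the two results separate, so a missing overhead figure cannot turn a good connection into a reported failure.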
During some testing, a new error appeared, but only once: [189240][E][ssl_client.cpp:37] _handle_error(): [data_to_read():361]: (-29056) SSL - Verification of the message MAC failed
A similar question was answered in the [Nordic DevZone](https://devzone.nordicsemi.com/f/nordic-q-a/89820/nrf9160-modem_key_mgmt_cmp-returns--1), and that may also be the answer to this question.
For openssl, "010102030405060708090a0b0c0d0e0f" results in a 16-byte secret. About b'010102030405060708090a0b0c0d0e0f' I'm not so sure; from other SO questions, I think it's a 32-byte secret. If the peers don't share the same secret, the handshake fails. Some implementations will simply time out the handshake, while others may report a MAC validation error, because a mismatching secret creates different association keys, and with those the MAC validation of the handshake 'Finished' message fails.
Either use "30 31 30 31 30 32 30 33 30 34 30 35 30 36 30 37 30 38 30 39 30 61 30 62 30 63 30 64 30 65 30 66" (remove the spaces!) for openssl, or use b'\x01\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f' for python.
Hope that works.
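The difference between the two interpretations of the same text can be checked with a short sketch. The helper names `psk_from_hex` and `psk_from_ascii` are invented for this illustration; the first decodes pairs of hex digits into bytes, the second takes every character as one byte, which is what a byte-string literal of plain characters amounts to.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Convert one hex digit to its value; throws on anything else.
int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Interpret the text as hex: two characters become one byte.
std::vector<std::uint8_t> psk_from_hex(const std::string& hex) {
    if (hex.size() % 2 != 0) throw std::invalid_argument("odd-length hex string");
    std::vector<std::uint8_t> out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        out.push_back(static_cast<std::uint8_t>(hex_nibble(hex[i]) * 16 + hex_nibble(hex[i + 1])));
    }
    return out;
}

// Interpret the text literally: every character becomes one byte.
std::vector<std::uint8_t> psk_from_ascii(const std::string& text) {
    return std::vector<std::uint8_t>(text.begin(), text.end());
}
```

For "010102030405060708090a0b0c0d0e0f", the hex interpretation gives 16 bytes starting with 0x01, while the literal interpretation gives 32 bytes starting with 0x30 0x31, so the two peers would end up with different secrets.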
23:15:42.714 > [505763][E][ssl_client.cpp:37] _handle_error(): [data_to_read():361]: (-29184) SSL - An invalid SSL record [505763][E][ssl_client.cpp:37] _hawas receivedndle_error(
23:15:42.715 > ): [data_to_read():361]: (-29184) SSL - An invalid SSL record was received
.
.
.
23:15:42.498 > [_SOCKET] Connected!
23:15:42.519 > [THINGER] Authenticating. User: iot_public Device: XXXXXXXXXXXXXXXXX
23:15:42.521 > Thinger State Listener: THINGER_AUTHENTICATING
23:15:42.521 > [THINGER] Writing bytes: 56 [OK]
23:15:42.714 > [505763][E][ssl_client.cpp:37] _handle_error(): [data_to_read():361]: (-29184) SSL - An invalid SSL record [505763][E][ssl_client.cpp:37] _hawas receivedndle_error(
23:15:42.715 > ): [data_to_read():361]: (-29184) SSL - An invalid SSL record was received
23:15:52.527 > [_SOCKET] Cannot read from socket!
23:15:52.530 > [THINGER] Auth Failed! Check username, device id, or device credentials.
23:15:52.545 > [_SOCKET] Is now closed!
23:15:57.548 > [_SOCKET] Connecting to XXXXXX.aws.thinger.io:25202...
We have tested an ESP32 with FreeRTOS and did not observe anything unusual:
[NETWORK] Starting connection...
[NETWORK] Connecting to network XXXXXXX
[NETWORK] Connected to WiFi!
[NETWORK] Getting IP Address...
[NETWORK] Got IP Address: 192.168.1.144
[NETWORK] Connected!
[_SOCKET] Connecting to xxxx.aws.thinger.io:25202...
[_SOCKET] Using secure TLS/SSL connection: yes
[_SOCKET] Connected!
[THINGER] Authenticating. User: alvarolb Device: esp32rtos
[THINGER] Writing bytes: 39 [OK]
[THINGER] Authenticated
Can you try to check the WiFi signal of those devices?
void loop() {
    // use loop as in a normal Arduino sketch
    // use thing.lock() / thing.unlock() if using variables exposed on thinger resources
    // WiFi.RSSI() reports the signal strength in dBm
    Serial.printf("WiFi Signal: %d dBm\n", WiFi.RSSI());
    delay(1000);
}
Hi @alvarolb
We will redo the tests now and post the details of the results.
We will run the tests with the Arduino IDE (with board versions esp 3.1.1 and esp 2.0.17) and post the results later today.
Hello, @alvarolb
Apparently, we have found the origin of the bug.
Everything indicates that running Thinger under FreeRTOS with thing.start() on core 0 while calling thing.is_connected() very frequently from core 1 causes an access conflict, which produces the SSL error.
This leaves a question: do we need some kind of locking when we call the thing object from the loop on core 1? We will discuss this in a separate topic.
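One general way to avoid this kind of conflict is to route every access to the shared object through a single mutex. The sketch below shows the pattern in portable C++ with std::mutex; `FakeClient` and `GuardedClient` are made-up stand-ins, not the Thinger API. On the ESP32 the equivalent primitive would be a FreeRTOS mutex (xSemaphoreCreateMutex, xSemaphoreTake, xSemaphoreGive). Whether thing.lock() and thing.unlock() from the snippet above already cover is_connected() is not something this sketch can confirm.

```cpp
#include <mutex>

// Hypothetical stand-in for the client object; not the real Thinger API.
struct FakeClient {
    bool connected = false;
    int calls = 0;  // counts how many times the client was touched
    void handle() { ++calls; connected = true; }
    bool is_connected() { ++calls; return connected; }
};

// Wrapper that takes the same mutex around every call, so two tasks
// can never be inside the client at the same time.
class GuardedClient {
public:
    void handle() {
        std::lock_guard<std::mutex> lock(mutex_);
        client_.handle();
    }
    bool is_connected() {
        std::lock_guard<std::mutex> lock(mutex_);
        return client_.is_connected();
    }
    int calls() {
        std::lock_guard<std::mutex> lock(mutex_);
        return client_.calls;
    }
private:
    std::mutex mutex_;
    FakeClient client_;
};
```

In this arrangement the task on one core would call handle() and the loop on the other core would call is_connected(), both through the same GuardedClient instance. Polling less often, or caching the last known state, would also reduce contention.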