Date: 2023-05-09
Time: 14:00 CEST
Moderator: Leandro
Venue: https://meet.jit.si/RIOT-VMA-2023-05
VMA forum entry: https://forum.riot-os.org/t/virtual-maintainer-assembly-vma-2023-05/3923
Previous VMA pad: https://hackmd.io/y9l8MOKRT7y_a6CqlO_q3A
Previous VMA forum entry: https://forum.riot-os.org/t/virtual-maintainer-assembly-vma-2023-02/3859
Note taking: Leandro
Oleg is not here. His topic can probably be discussed in the Forum.
Kevin: we have the compile-and-test-for-board script, and we found that boards are failing. We want to attach the results of the script to the release as an asset. It documents the current state of all boards.
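A hedged sketch of what such a per-board run could look like, assuming a typical RIOT checkout with the compile_and_test_for_board.py tool; the checkout path, the board names, and the use of the incremental flag (mentioned later in the discussion) are illustrative assumptions, not details from the notes:

```shell
#!/bin/sh
# Sketch: run the compile-and-test script once per locally connected
# board, keeping one result directory per board.
# RIOTBASE, the script path and the board list are assumptions.
RIOTBASE="${RIOTBASE:-$HOME/RIOT}"
SCRIPT="$RIOTBASE/dist/tools/compile_and_test_for_board/compile_and_test_for_board.py"

for board in samr21-xpro nucleo-f103rb; do
    if [ -x "$SCRIPT" ]; then
        # the incremental flag is meant to skip applications that
        # already passed in a previous run
        "$SCRIPT" "$RIOTBASE" "$board" "results-$board" --incremental
    else
        echo "skipping $board: $SCRIPT not found"
    fi
done
```

The resulting `results-*` directories are what could then be bundled and attached to the release.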
Ben: the test script is a bit too simple / brute force; it runs many unit tests that are completely hardware-independent.
Ben: actual board tests require some hardware setup. The current ones are unit tests and simple. Many times there are connection issues that get reported as test errors. Some tests need hardware connections, and running only the script does not provide enough value. We need to improve the tests and provide instructions on how to run tests that actually exercise the hardware and boards (e.g. UART, GPIOs, …)
Kevin: sometimes we think that some things are HW-independent and affect no boards, but they still fail sometimes. Is running through the whole suite really so costly?
Ben: the false positives/negatives are the cost. You can't tell whether real issues are hiding in the results because of them.
Kevin: running with the incremental flag should help with that
Martine: false positives??
Kevin: false negatives! :) we should change our tests to prevent false negatives. There's going to be a spec that says how to set up the board.
Maribu: that’s make test-with-config
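For context, a minimal sketch of running the target Maribu refers to, assuming a peripheral test application and a board wired as that test's documentation requires; the application directory and BOARD value are illustrative assumptions:

```shell
#!/bin/sh
# Sketch: run a test whose required hardware configuration (e.g. pins
# wired together) is documented alongside the test.
# APP_DIR and BOARD are examples; adjust to your RIOT checkout and setup.
APP_DIR="tests/periph_uart"
if [ -d "$APP_DIR" ]; then
    BOARD=samr21-xpro make -C "$APP_DIR" flash test-with-config
else
    echo "skipping: $APP_DIR not found (run from a RIOT checkout)"
fi
```

The point of `test-with-config` in the discussion is that it is only expected to pass when the board is wired according to the documented configuration, unlike the plain hardware-independent unit tests.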
Kevin: I agree it's not always easy to find the bugs, but we should work towards fixing the flaky tests, and we should have more confidence in the boards and in the tests that were run on them during the release cycle.
Maribu: when I run the script I find real bugs, but it often takes 1 hour per board, around 4 hours for a full cycle. If we optimized, we could get the same output in less time. Why do we have so many failing tests? I don't see this so often.
Kevin: a lot of the false negatives come from the iotlab tests. On a local setup (8 boards) I had to run multiple times, and there were repeatable bugs. I used docker, and a toolchain issue is still an issue. I think it's valuable. We should get it to the point where the time it takes is small compared to the value it provides.
Maribu: our scripts could be improved. Sometimes the output of an old test program still matches; the script should check that the flashed app is the correct one (i.e. that flashing worked). We need to update the tests when we update the applications, otherwise they are flaky.
Kevin: is it worth doing? Attaching the results of the tests to the release?
Ben: it makes sense to add selected tests to the release. Having 20 tests that actually exercise the hardware is more valuable than 200 tests of packages.
Kevin: we need an "only hardware" flag for running these tests with the script
Martine: the problem is that the script defines no configuration at all, not which configuration to use
Maribu: an Arduino shield that provides whatever a test requires (e.g. wiring) would cover a lot of tests
Kevin: PHiLIP! It's meant for the peripheral tests
Maribu: it would be super cool! Even just a wire between RX and TX for UART would help. PHiLIP would be even better because it would catch wrong UART communication parameters (e.g. symbol rate not as configured) that a "loopback wire" wouldn't
Kevin: OK :D attaching results to release?
no strong opinions