The intent of the outputCfgs argument to the configure() function of
converter classes and the softISP is to allow the passed in stream-configs
to not be changed.
But only the vector is const; the references inside the vector are not
const, which allows modifying the stream-configs as can be seen inside
DebayerEGL::configure() which was using a non const reference outputCfg
helper variable.
Fix this by making the references inside the vector const.
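The difference can be sketched as follows, assuming the references are held
as std::reference_wrapper elements. StreamConfig, configureBefore() and
configureAfter() are hypothetical stand-ins, not the real signatures:

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for a stream configuration.
struct StreamConfig {
	unsigned int width = 0;
	unsigned int height = 0;
};

// Before: only the vector is const, each element still refers to a mutable object.
using MutableRefs = std::vector<std::reference_wrapper<StreamConfig>>;
// After: the referenced objects are const as well.
using ConstRefs = std::vector<std::reference_wrapper<const StreamConfig>>;

void configureBefore(const MutableRefs &outputCfgs)
{
	// Compiles although outputCfgs is const: the wrapper converts to StreamConfig &.
	StreamConfig &outputCfg = outputCfgs[0];
	outputCfg.width = 1;
}

unsigned int configureAfter(const ConstRefs &outputCfgs)
{
	// Only a const reference can be obtained now.
	const StreamConfig &outputCfg = outputCfgs[0];
	// outputCfg.width = 1; // would no longer compile
	return outputCfg.width;
}
```

With the second alias an accidental write through a helper reference is
rejected at compile time instead of silently changing the caller's config.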
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Reviewed-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
As shown by commit 94d32fdc55 ("pipeline: simple: Consider output sizes
when choosing pipe config"), the extra pixel columns CPU debayering
requires on the input side make resolution selection non-trivial.
Add logging of the selected input config on a successful configure() so
that the logs clearly show which sensor mode has been selected.
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Reviewed-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Add CPU soft ISP multi-threading support.
Benchmark results on the Arduino Uno-Q, whose weak CPU makes it a good
target for performance testing. All numbers are with an IMX219 running at
3280x2464 -> 3272x2464:
1 thread : 147ms / frame, ~6.5 fps
2 threads: 80ms / frame, ~12.5 fps
3 threads: 65ms / frame, ~15 fps
Adding a 4th thread does not improve performance.
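A minimal sketch of one common way to spread a per-line loop over several
threads: give each thread a contiguous band of rows. Whether the patch
splits the work exactly like this is an assumption, and processLine() and
processFrame() are hypothetical names used only for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical per-line worker: it simply inverts 8-bit pixels.
static void processLine(const uint8_t *src, uint8_t *dst, unsigned int width)
{
	for (unsigned int x = 0; x < width; x++)
		dst[x] = 255 - src[x];
}

// Split the frame into contiguous bands of rows, one band per thread.
void processFrame(const std::vector<uint8_t> &src, std::vector<uint8_t> &dst,
		  unsigned int width, unsigned int height,
		  unsigned int threadCount)
{
	threadCount = std::max(1u, std::min(threadCount, height));
	const unsigned int rowsPerThread = (height + threadCount - 1) / threadCount;

	std::vector<std::thread> workers;
	for (unsigned int i = 0; i < threadCount; i++) {
		const unsigned int yStart = std::min(height, i * rowsPerThread);
		const unsigned int yEnd = std::min(height, yStart + rowsPerThread);
		workers.emplace_back([&src, &dst, width, yStart, yEnd] {
			for (unsigned int y = yStart; y < yEnd; y++) {
				const std::size_t offset = static_cast<std::size_t>(y) * width;
				processLine(&src[offset], &dst[offset], width);
			}
		});
	}

	// Wait for all bands before the frame is handed on.
	for (std::thread &worker : workers)
		worker.join();
}
```

Each thread writes only to its own rows, so no locking is needed inside the
loop.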
Tested-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> # ThinkPad X1 Yoga Gen 7 + ov2740
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Add a DebayerCpuThread class and use this in the inner render loop.
This contains data which needs to be separate per thread.
This is a preparation patch for making DebayerCpu support multi-threading.
Benchmarking on the Arduino Uno-Q with a weak CPU which is good for
performance testing, shows 146-147ms per 3272x2464 frame both before and
after this change, with things maybe being 0.5 ms slower after this change.
Tested-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> # ThinkPad X1 Yoga Gen 7 + ov2740
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Make the storage used to accumulate the RGB sums and the Y histogram
value a vector of SwIspStats objects instead of a single object so
that when using multi-threading every thread can use its own storage to
collect intermediate stats to avoid cache-line bouncing.
Benchmarking with the GPU-ISP which does separate swstats benchmarking,
on the Arduino Uno-Q which has a weak CPU which is good for performance
testing, shows 20ms to generate stats for a 3272x2464 frame both before
and after this change.
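The per-thread storage idea can be sketched as below. The Stats and Pixel
structs, the luma weights and the 64-bin histogram are simplified
assumptions for illustration and do not mirror the real SwIspStats layout:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical, simplified statistics record.
struct Stats {
	uint64_t sumR = 0, sumG = 0, sumB = 0;
	std::array<uint32_t, 64> yHistogram{};
};

struct Pixel {
	uint8_t r, g, b;
};

Stats collectStats(const std::vector<Pixel> &pixels, unsigned int threadCount)
{
	threadCount = std::max(1u, threadCount);
	// One Stats object per thread, so threads never write to shared counters.
	std::vector<Stats> perThread(threadCount);
	const std::size_t chunk = (pixels.size() + threadCount - 1) / threadCount;

	std::vector<std::thread> workers;
	for (unsigned int t = 0; t < threadCount; t++) {
		workers.emplace_back([&perThread, &pixels, chunk, t] {
			Stats &local = perThread[t];
			const std::size_t begin = std::min(pixels.size(), t * chunk);
			const std::size_t end = std::min(pixels.size(), begin + chunk);
			for (std::size_t i = begin; i < end; i++) {
				const Pixel &p = pixels[i];
				local.sumR += p.r;
				local.sumG += p.g;
				local.sumB += p.b;
				// Approximate luma in 0..255, reduced to 64 bins.
				const unsigned int y = (p.r * 77 + p.g * 150 + p.b * 29) >> 8;
				local.yHistogram[y >> 2]++;
			}
		});
	}
	for (std::thread &worker : workers)
		worker.join();

	// Merge the intermediate results once, after all threads have finished.
	Stats total;
	for (const Stats &s : perThread) {
		total.sumR += s.sumR;
		total.sumG += s.sumG;
		total.sumB += s.sumB;
		for (std::size_t i = 0; i < total.yHistogram.size(); i++)
			total.yHistogram[i] += s.yHistogram[i];
	}
	return total;
}
```

The merge costs a few hundred additions per frame, which is small compared
to the per-pixel accumulation it keeps contention-free.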
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
With the GPU accelerated softISP 2 separate benchmark results are printed,
1 for the generation of the output images on the GPU and a separate one
for generating the statistics on the CPU.
Add a new name argument to the Benchmark class descriptor and print this
out when printing the benchmark result.
Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
When calling `Debayer::process()` from `SoftwareIsp::process()`, the
`DebayerParams` object is copied multiple times:
(1) call of `BoundMethodMember<...>::activate()`
inside `Object::invokeMethod()`
(2) constructor of `BoundMethodArgs<...>`
inside `BoundMethodMember<...>::activate()`
(3) call of `BoundMethodMember<...>::invoke()`
inside `BoundMethodArgs::invokePack()`
(4) call of the actual pointer to member function
inside `BoundMethodMember::invoke()`
While compilers might avoid one or two of the above copies, this is still
not ideal. By making `Debayer::process()` take the parameter object by
const lvalue reference, only the copy in the `BoundMethodArgs` constructor
remains. So do that.
Before:
[0:12:51.133836595] [12424] DEBUG SoftwareIsp software_isp.cpp:399 params=0x7d0a691f57d0
copy from 0x7d0a691f57d0 into 0x7baa65f2bf30
copy from 0x7baa65f2bf30 into 0x7c6a69209758
copy from 0x7c6a69209758 into 0x7baa63223930
copy from 0x7baa63223930 into 0x7baa63223a70
[0:12:51.134559602] [12426] DEBUG eGL debayer_egl.cpp:538 params=0x7baa63223a70
771.099877 (30.06 fps) cam0-stream0 seq: 000031 bytesused: 8666112
After:
[0:13:42.861691943] [12543] DEBUG SoftwareIsp software_isp.cpp:399 params=0x7cfaad5f57d0
copy from 0x7cfaad5f57d0 into 0x7c5aad609758
[0:13:42.862453917] [12545] DEBUG eGL debayer_egl.cpp:538 params=0x7c5aad609758
822.827388 (30.02 fps) cam0-stream0 seq: 000031 bytesused: 8666112
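The effect can be reproduced in isolation with a copy-counting type. This
is only a sketch: Params, forwardByValue() and QueuedCall are hypothetical
and merely imitate by-value forwarding layers and a queued call that has
to store its argument:

```cpp
// Counts how often Params is copy-constructed.
static int copyCount = 0;

// Hypothetical stand-in for a parameter object.
struct Params {
	float gamma = 1.0f;
	Params() = default;
	Params(const Params &other) : gamma(other.gamma) { copyCount++; }
};

float processByValue(Params params) { return params.gamma; }
float processByRef(const Params &params) { return params.gamma; }

// Every forwarding layer that takes the object by value adds one copy.
float forwardByValue(Params params) { return processByValue(params); }

// A queued call must keep the argument alive until it runs, so the one
// copy into its storage remains even when the callee takes a reference.
struct QueuedCall {
	Params stored;
	explicit QueuedCall(const Params &params) : stored(params) {}
	float run() const { return processByRef(stored); }
};

int copiesThroughValueChain()
{
	Params params;
	copyCount = 0;
	forwardByValue(params);
	return copyCount;
}

int copiesThroughQueuedCall()
{
	Params params;
	copyCount = 0;
	QueuedCall call(params);
	call.run();
	return copyCount;
}
```

Here the by-value chain of two hops makes two copies, while the queued call
with a const reference callee makes exactly one.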
Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
The default CCM in uncalibrated.yaml is just an identity transformation
and has been enabled by default only to always provide a correction
matrix to GPU ISP. It slows down CPU ISP when CCM is not used.
Now, when a default correction matrix is always provided to GPU ISP, we
can disable the Ccm algorithm in uncalibrated.yaml again. The check for
ccmEnabled in GPU ISP is no longer needed and it must be removed in
order not to fail when Ccm algorithm is not enabled. ccmEnabled flag is
still needed in CPU ISP where the processing differs based on whether
CCM is present or not.
Reviewed-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
Reviewed-by: Robert Mader <robert.mader@collabora.com>
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
The Lut algorithm is not really an algorithm. Moreover, algorithms may
be enabled or disabled but with Lut disabled, nothing will work.
Let's move the construction of lookup tables to CPU debayering, where it
is used. The implied and related changes are:
- DebayerParams is changed to contain the real params rather than lookup
tables.
- contrastExp parameter introduced by GPU ISP is used for CPU ISP too.
- The params must be initialised so that debayering gets meaningful
parameter values even when some algorithms are disabled.
- combinedMatrix must be stored in params wherever it is modified.
- Matrix changes needn't be tracked in the algorithms any more.
- CPU debayering must watch for changes of the corresponding parameters
to update the lookup tables when and only when needed.
- Swapping red and blue is integrated into lookup table constructions.
- gpuIspEnabled flags are removed as they are not needed any more.
Reviewed-by: Robert Mader <robert.mader@collabora.com>
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Acked-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
The black level offset subtracted in AWB is wrong. It assumes that the
stats contain sums of the individual colour pixels. But they actually
contain sums of the colour channels of larger "superpixels" consisting
of the individual colour pixels. Each of the RGB colour values and the
computed luminosity (a histogram entry) are added once to the stats per
such a superpixel. This means the offset computed from the black level
and the number of pixels should be used as it is, not divided.
The patch fixes the subtracted offset. Since the evaluation is the same
for all the three colours now, the individual class variables are
replaced with a single RGB variable.
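A minimal sketch of the corrected arithmetic, assuming the sums and the
black level use the same value scale. RGBSums, subtractBlackLevel() and the
clamping to zero are illustrative assumptions, not the actual AWB code:

```cpp
#include <cstdint>

// Hypothetical container for the per-channel sums from the stats.
struct RGBSums {
	uint64_t r, g, b;
};

// Every superpixel contributes exactly one value per channel to the sums,
// so the same offset (black level times superpixel count) applies to R, G
// and B alike.
RGBSums subtractBlackLevel(const RGBSums &sums, uint64_t blackLevel,
			   uint64_t nSuperpixels)
{
	const uint64_t offset = blackLevel * nSuperpixels;
	// Clamp at zero so that unsigned subtraction cannot wrap around.
	auto sub = [offset](uint64_t value) {
		return value > offset ? value - offset : 0;
	};
	return { sub(sums.r), sub(sums.g), sub(sums.b) };
}
```

For example, with a black level of 16 and 10 superpixels the offset is 160
for each of the three channels.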
Fixes: 4e13c6f55b ("Honor black level in AWB")
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Robert Mader <robert.mader@collabora.com>
Tested-by: Robert Mader <robert.mader@collabora.com>
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Mesa surfaceless platform appears to be a better fit for the use-case at hand:
1. Like GBM it is Mesa specific, so no change in supported setups is
expected. If ever required, a fallback to the generic device platform
could be added on top.
2. It leaves the complexity of selecting a renderer device to the
driver, reducing code and dependencies.
3. It allows using llvmpipe / software drivers without a DRI device,
which can be useful on CI or debugging (with LIBGL_ALWAYS_SOFTWARE=1).
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # sm8250/rb5, x1e/Dell Inspiron14p
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Tested-by: Milan Zamazal <mzamazal@redhat.com> # TI AM69
Tested-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> # ThinkPad X1 Yoga Gen 7 + ov2740
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
In some cases the GPU can deliver 15x performance in Debayer with the
CCM on, reference hardware Qualcomm RB5 with IMX512 sensor.
Given this large performance difference it makes sense to make GPUISP
the default for the Software ISP.
If LIBCAMERA_SOFTISP_MODE is omitted gpu will be the default. If
libcamera is compiled without gpuisp support, CPU Debayer will be used.
It is still possible to select CPU mode with LIBCAMERA_SOFTISP_MODE=cpu.
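The selection rule described above could look roughly like this. The enum,
the function names and the handling of unrecognised values (treated like an
unset variable) are assumptions made for the sketch:

```cpp
#include <cstdlib>
#include <string>

enum class IspMode { Cpu, Gpu };

// value: content of LIBCAMERA_SOFTISP_MODE, or nullptr when it is unset.
// gpuAvailable: whether the build includes GPU ISP support.
IspMode selectIspMode(const char *value, bool gpuAvailable)
{
	// Without GPU ISP support only CPU debayering is possible.
	if (!gpuAvailable)
		return IspMode::Cpu;

	// An explicit request for the CPU path is honoured.
	if (value && std::string(value) == "cpu")
		return IspMode::Cpu;

	// Unset (or, by assumption, any other value): GPU is the default.
	return IspMode::Gpu;
}

IspMode selectIspModeFromEnv(bool gpuAvailable)
{
	return selectIspMode(std::getenv("LIBCAMERA_SOFTISP_MODE"), gpuAvailable);
}
```

Keeping the decision in a function that takes the value as an argument
makes the rule easy to test without touching the process environment.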
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Tested-by: Robert Mader <robert.mader@collabora.com>
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen 6 (arm64) ov02c10 + X1c gen 12 ov08x40
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> # Lenovo X13s
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
If GPUISP support is available make it so an environment variable can
switch it on.
Given we don't have full feature parity with CPUISP just yet on pixel
format output, we should default to CPUISP mode giving the user the option
to switch on GPUISP by setting LIBCAMERA_SOFTISP_MODE=gpu.
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Tested-by: Robert Mader <robert.mader@collabora.com>
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen 6 (arm64) ov02c10 + X1c gen 12 ov08x40
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> # Lenovo X13s
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Make getInputConfig and getOutputConfig static so as to allow for
interrogation of the supported pixel formats prior to object instantiation.
Do this so as to allow the higher level logic to make an informed choice
between CPU and GPU ISP based on which pixel formats are supported.
Currently CPU ISP supports more diverse input and output schemes.
Acked-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Tested-by: Robert Mader <robert.mader@collabora.com>
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen 6 (arm64) ov02c10 + X1c gen 12 ov08x40
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> # Lenovo X13s
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
In order to have Debayer::start() tell the eGL shader compilation routine what
the input and output pixel format is, we need to have a copy of the
selected format available. Add variables to the inputConfig and
outputConfig structures to allow tracking of this data for later use.
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Tested-by: Robert Mader <robert.mader@collabora.com>
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen 6 (arm64) ov02c10 + X1c gen 12 ov08x40
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> # Lenovo X13s
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Pass contrastExp as calculated in lut to debayer params not the raw
contrast. This way we calculate contrastExp once per frame in lut and pass
the calculated value into the shaders, instead of passing contrast and
calculating contrastExp once per pixel in the shaders.
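The hoisting can be illustrated on the CPU. Both the mapping from contrast
to contrastExp and the S-curve below are assumed formulas chosen for the
sketch, not necessarily the ones used in lut or in the shaders:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

constexpr double kPi = 3.14159265358979323846;

// Assumed mapping from a contrast setting in [0, 2] to an exponent.
double computeContrastExp(double contrast)
{
	return std::tan(std::clamp(contrast * kPi / 4.0, 0.0, kPi / 2.0 - 0.00001));
}

// Assumed S-curve on a value in [0, 1], using the precomputed exponent.
double applyContrast(double value, double contrastExp)
{
	if (value < 0.5)
		return 0.5 * std::pow(value / 0.5, contrastExp);
	return 1.0 - 0.5 * std::pow((1.0 - value) / 0.5, contrastExp);
}

void applyContrastToFrame(std::vector<double> &pixels, double contrast)
{
	// The exponent depends only on the contrast setting, so it is
	// computed once per frame instead of once per pixel.
	const double contrastExp = computeContrastExp(contrast);
	for (double &pixel : pixels)
		pixel = applyContrast(pixel, contrastExp);
}
```

With these assumed formulas a contrast of 1.0 gives an exponent of about 1,
which leaves the pixel values essentially unchanged.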
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Tested-by: Robert Mader <robert.mader@collabora.com>
Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen 6 (arm64) ov02c10 + X1c gen 12 ov08x40
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> # Lenovo X13s
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
In order to initialise and deinitialise gpuisp we need to be able to setup
EGL in the same thread as Debayer::process() happens in.
This requires extending the Debayer object to provide start and stop
methods which are triggered through invokeMethod in the same way as
process() is.
Introduce start() and stop() methods to the Debayer class. Trigger those
methods as described above via invokeMethod. The debayer_egl class will
take care of initialising and de-initialising as necessary. Debayer CPU
sees no functional change.
Per feedback from Barnabas the stop method is using blocking
synchronisation and thus we drop ispWorkerThread_.removeMessages().
[bod: Made method blocking not queued per Robert's bugfixes]
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Move the initialisation of Bayer params and CCM to a new constructor in the
Debayer class.
Ensure we call the base class constructor from DebayerCpu's constructor in
the expected constructor order Debayer then DebayerCpu.
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
The DebayerCpu class has a number of variables, embedded structures and
methods which are useful to the DebayerGpu implementation.
Move relevant variables and methods to base class.
Since we want to call setParams() from the GPUISP and reuse the code in
the existing CPUISP as a first step, we need to move all of the
dependent variables in DebayerCpu to the Debayer base class including
LookupTable and redCcm_.
The DebayerEGL class will ultimately be able to consume both the CCM and
non-CCM data.
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Add a method to the SwstatsCpu class to process a whole Framebuffer in
one go, rather than line by line. This is useful for gathering stats
when debayering is not necessary or is not done on the CPU.
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[bod: various rebase splats fixed]
[bod: Added constructor Doxygen header]
[bod: Squashed a fix from Hans to calculate stats on every 4th frame]
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
patternSize_ is a private variable and its meaning is already documented
in the patternSize() getter documentation.
Move the list of valid sizes to the patternSize() getter documentation
and drop the patternSize_ documentation.
While at it also add 1x1 as valid size for use with future support
of single plane non Bayer input data.
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Update the documentation of the statsProcessFn() / processLine0() src[]
pointer argument to take into account that swstats_cpu may also be used
with planar input data or with non Bayer single plane input data.
The statsProcessFn typedef is private, so no documentation is generated
for it. Move the new updated src[] pointer argument documentation to
processLine0() so that it gets included in the generated docs.
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Milan Zamazal <mzamazal@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
For debugging purposes, threads can be assigned a name, which eases
distinguishing between them in e.g. htop or gdb. This uses a
Linux-specific API for now which is limited to 15 characters (+ null
terminator), so truncation is done and names for existing thread
instantiations were chosen to be concise.
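A minimal sketch of naming the calling thread with truncation, assuming
the glibc pthread_setname_np() / pthread_getname_np() functions on Linux.
The helper names are hypothetical and this is not the Thread class API:

```cpp
#include <cstddef>
#include <pthread.h>
#include <string>

// Linux limits thread names to 15 characters plus the null terminator.
std::string truncateThreadName(const std::string &name)
{
	constexpr std::size_t kMaxLength = 15;
	return name.substr(0, kMaxLength);
}

// Name the calling thread. Returns 0 on success, an error number otherwise.
int setCurrentThreadName(const std::string &name)
{
	return pthread_setname_np(pthread_self(), truncateThreadName(name).c_str());
}

// Read the name back, for example to check what htop or gdb will show.
std::string currentThreadName()
{
	char buffer[16] = {};
	pthread_getname_np(pthread_self(), buffer, sizeof(buffer));
	return buffer;
}
```

Truncating up front matters because the call fails with ERANGE, rather than
truncating by itself, when the name is too long.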
[Kieran: Apply checkstyle suggestions, rebase on proxy rework]
Signed-off-by: Schulz, Andreas <andreas.schulz2@karlstorz.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com>
Reviewed-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>