1. Introduction
Digital signage is a display device that provides a variety of useful information and is already installed in many places such as streets and public facilities. Visible light communication (VLC) schemes, which add a data transmission function to the digital signage and enable mobile terminals to receive the data information, have been studied. The concept of VLC between a digital signage and an image sensor is shown in Fig. 1. The transmitter side superimposes data information signals, invisibly to the human eye, on the luminance or chromaticity component of the image contents displayed on the digital signage. The receiver side receives the data information signals by capturing the digital signage with the image sensor of the mobile terminal (e.g., a smartphone camera) and applying image processing. This allows users viewing the digital signage to receive value-added data information, such as local and real-time advertisements, events, disaster information, and augmented reality, via their mobile terminals, in addition to the displayed image content itself.
This VLC system has the advantage that it can be realized with existing devices and does not require any additional or modified hardware. However, there is a trade-off relationship between the communication quality and the visual quality of the displayed image contents after data signal superimposition. If the data information signals are superimposed on the image contents with strong signal intensity to improve the communication quality, the visual quality is degraded, and the superimposition of the data information signals is easily perceived by the human eye. Conversely, if the data information signals are superimposed with weak signal intensity to improve the visual quality, the data information cannot be received correctly. Several proposals have been made to solve this problem.
One method is to transmit data signals by placing multiple markers that change color at a slow speed that is difficult for the human eye to perceive [1]; however, the data rate is as low as 10 bps. Another method is to modulate the backlight of the digital signage at a high speed that is difficult for the human eye to perceive [2]; however, it requires a modification of the digital signage hardware. In the research focusing on the method of superimposing data signals at the transmitter side, modulation methods using wavelet transform [3] and discrete cosine transform [4], which modulate data signals with frequency components, have been studied. In addition, modulation methods that reduce the visual quality degradation by superimposing data signals with a color component that is difficult for the human eye to perceive have been studied [5], [6].
In the previous studies [5], [6], the data information was superimposed on the displayed image by exploiting the characteristic of the human eye that it is sensitive to changes in the luminance component but insensitive to changes in the chromaticity component. These methods used the YCbCr color space (Y: luminance component, Cr: chromaticity component in the redness-greenness direction, Cb: chromaticity component in the yellowness-blueness direction) for modulation, and the modulation method using the Cr or Cb chromaticity component was superior to that using the RGB components or the Y component. However, the YCbCr color space does not fully take into account the characteristics of human color perception. Thus, even if the signal intensity is the same, the amount of visual change varies depending on the modulated color. As a result, the superimposed data signals may be easily perceived. To improve this, we have proposed a modulation method with a perceptually uniform color space and showed that it provides better visual quality at the same communication quality [7]. However, the evaluation of the communication quality and the visual quality was performed only for some specific image contents and was not sufficient for verification. In addition, the above modulation methods were based on only one color component, and whether modulation with the redness-greenness or the yellowness-blueness chromaticity component performs better depended on the image contents. Thus, the disadvantage is that the optimal chromaticity component must be selected according to the image contents.
In this paper, we clarify the superiority of the modulation method with chromaticity components in perceptually uniform color space by evaluating the communication and visual quality using the ultra-high definition/wide-color-gamut standard test images of the Institute of Image Information and Television Engineers (ITE). In addition, we propose a novel communication method that simultaneously modulates the same data signal with two chromaticity components and uses diversity combining (selective combining and maximum ratio combining, which are generalizations of [8]) in the demodulation, and show that the communication quality is improved for almost all the standard images at the same visual quality. These verifications are performed through a trade-off evaluation of the communication quality, measured experimentally in terms of bit error rate, and the visual quality, calculated quantitatively in terms of color difference defined by the Commission Internationale de l'Eclairage (CIE).
This paper is organized as follows: Section 2 describes the modulation and demodulation process based on the conventional method using one chromaticity component. Section 3 describes the details of the conventional modulation method using the chromaticity components of the YCbCr color space and the proposed modulation method using the chromaticity components of the uniform color space. Section 4 describes the proposed diversity modulation/demodulation method using two chromaticity components. Section 5 shows the experimental evaluation and discussion of the communication and visual quality. Section 6 summarizes the conclusions.
2. System Model
The system model is shown in Fig. 2. This paper extends the conventional method [5], which uses one chromaticity component, in the following two points. For the sake of fairness, the same extensions are also applied to the conventional methods in the evaluation.
- BPSK (Binary Phase Shift Keying)-based modulation, in which data bits 0 and 1 are modulated into baseband signals with amplitudes of \(-\alpha/2\) and \(\alpha/2\), is used instead of OOK (On Off Keying)-based modulation, in which data bits 0 and 1 are modulated into baseband signals with amplitudes of 0 and \(\alpha\), because the performance differs in the positive and negative directions of the modulated color component depending on the image contents.
- Demodulation performance is improved by performing inverse gamma correction on captured images, because images input to the digital signage are displayed with gamma correction. Here, the display gamma is assumed to be 2.2, which is the typical value in the sRGB standard for general purpose signage displays and image processing libraries such as OpenCV.
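The two points above can be illustrated with a minimal sketch. The function names are hypothetical and not taken from the paper, and the gamma correction is assumed to be a simple power law with exponent 2.2, as stated in the text.

```python
GAMMA = 2.2  # display gamma assumed in the text


def bpsk_amplitude(bit, alpha):
    """BPSK-based mapping: bit 0 -> -alpha/2, bit 1 -> +alpha/2."""
    return alpha / 2.0 if bit else -alpha / 2.0


def ook_amplitude(bit, alpha):
    """OOK-based mapping, shown for comparison: bit 0 -> 0, bit 1 -> alpha."""
    return alpha if bit else 0.0


def gamma_correct(value, gamma=GAMMA):
    """Display-side gamma correction of a pixel value in [0, 1]."""
    return value ** gamma


def inverse_gamma_correct(value, gamma=GAMMA):
    """Receiver-side inverse gamma correction of a pixel value in [0, 1]."""
    return value ** (1.0 / gamma)
```

Both mappings separate bit 0 and bit 1 by the same amount \(\alpha\). The BPSK-based one shifts the modulated color symmetrically in the positive and negative directions.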
2.1 Modulation
The data information is a sequence of binary symbol matrices \(d(i)\), where \(i\) indicates the frame number in the time direction. The differentially coded data matrix \(c(i) = d(i) \oplus c(i-1)\) is transformed into a data signal by mapping it to \(M \times N\) rectangular cells, each consisting of multiple pixels, on an image with the same resolution as the input image content. The cells corresponding to bit 0 and bit 1 have pixel values of \(-\alpha/2\) and \(\alpha/2\), respectively, where \(\alpha\) is an arbitrary signal intensity. This means that in the difference between successive frames, a cell in which the pixel value has not changed represents the data bit 0, and a cell in which the pixel value has changed by \(\alpha\) represents the data bit 1. The reason for the differential coding is that, as described in 2.2, the receiver takes the difference between successive captured frames to remove the image content component and extract only the data signal.
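As a minimal sketch of this step, the following code differentially encodes a sequence of bit matrices and expands one coded matrix into a data-signal image. It assumes that each coded frame is the XOR of the current data frame and the previous coded frame, starting from an all-zero reference frame, so that a changed cell between successive frames corresponds to data bit 1. The function names and the all-zero initial frame are illustrative assumptions.

```python
def differential_encode(data_frames):
    """Differentially encode bit matrices: c(i) = d(i) XOR c(i-1).

    The reference frame before the first data frame is assumed to be all zeros.
    """
    prev = [[0] * len(row) for row in data_frames[0]]
    coded = []
    for d in data_frames:
        c = [[db ^ pb for db, pb in zip(d_row, p_row)]
             for d_row, p_row in zip(d, prev)]
        coded.append(c)
        prev = c
    return coded


def cells_to_signal(coded, cell_h, cell_w, alpha):
    """Expand a coded bit matrix into a pixel image of -alpha/2 / +alpha/2 cells."""
    image = []
    for bit_row in coded:
        pixel_row = []
        for bit in bit_row:
            pixel_row.extend([alpha / 2.0 if bit else -alpha / 2.0] * cell_w)
        for _ in range(cell_h):
            image.append(list(pixel_row))
    return image
```

With this coding, XORing two successive coded frames returns the data frame, which is what the frame difference at the receiver relies on.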
The image content represented in RGB color space (sRGB) is converted to a desired color space, and the data signal is added to a color component in the desired color space. As the desired color space, we use the YCbCr color space and the \({\rm J_za_zb_z}\) uniform color space (details are explained in Sect. 3). In the one chromaticity component modulation, the data signal is added to either the redness-greenness or yellowness-blueness chromaticity component.
After data superimposition, the image is displayed on the digital signage after inverse conversion to the original sRGB color space. Here, the digital signage hardware performs gamma correction before display, and the transmitted image is displayed as gamma-corrected RGB colors (linear RGB).
2.2 Demodulation
After inverse gamma correction, the captured image is converted from the sRGB color space to the same color space used in the modulation, and only the chromaticity component on which the data signal is superimposed is extracted as an image \(F(i)\). For the sake of simplicity, the data signal portion in the captured image is assumed to be known. By subtracting \(F(i)\) from the previous \(F(i-1)\), the image content component is removed, and difference images are extracted: \(D_{rg}(i)\) as the redness-greenness chromaticity component or \(D_{yb}(i)\) as the yellowness-blueness chromaticity component.
The averaged images \(\bar{D}_{rg}(i)\) or \(\bar{D}_{yb}(i)\) are generated from \(D_{rg}(i)\) or \(D_{yb}(i)\) by averaging the pixel values within each of the \(M \times N\) cells and taking the absolute value. \(\bar{D}_{rg}(i)\) or \(\bar{D}_{yb}(i)\) corresponding to the modulated chromaticity component is binary-thresholded using cell threshold values, and thus the demodulated data \(\hat{d}_{rg}(i)\) or \(\hat{d}_{yb}(i)\) is obtained. Here, the receiver does not know the signal intensity \(\alpha\), so the cell threshold values are determined from the pilot frames. From two transmitted pilot frames that alternately transmit the cells corresponding to bit 0 and bit 1, the average of the values at the same cell position in the two pilot frames is set as the threshold value for each cell and each chromaticity component.
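The demodulation steps for one chromaticity component can be sketched as follows. Images are plain nested lists of one chromaticity component, and the function names, as well as passing the pilot cell values in as already-averaged matrices, are assumptions made for illustration.

```python
def frame_difference(f_cur, f_prev):
    """Pixel-wise difference between successive frames of one component."""
    return [[a - b for a, b in zip(row_c, row_p)]
            for row_c, row_p in zip(f_cur, f_prev)]


def cell_abs_mean(diff, cell_h, cell_w):
    """Average the difference image within each cell and take the absolute value."""
    rows = len(diff) // cell_h
    cols = len(diff[0]) // cell_w
    cells = []
    for m in range(rows):
        cell_row = []
        for n in range(cols):
            total = 0.0
            for y in range(m * cell_h, (m + 1) * cell_h):
                for x in range(n * cell_w, (n + 1) * cell_w):
                    total += diff[y][x]
            cell_row.append(abs(total / (cell_h * cell_w)))
        cells.append(cell_row)
    return cells


def pilot_thresholds(pilot_cells_a, pilot_cells_b):
    """Per-cell threshold: mean of the cell values of the two pilot frames."""
    return [[(a + b) / 2.0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pilot_cells_a, pilot_cells_b)]


def threshold_cells(cells, thresholds):
    """Binary decision per cell: 1 if the cell value exceeds its threshold."""
    return [[1 if s > t else 0 for s, t in zip(row_s, row_t)]
            for row_s, row_t in zip(cells, thresholds)]
```

For example, with an image content value of 0.3 and \(\alpha = 0.1\), a cell whose coded bit changes between frames yields a cell value close to 0.1, while an unchanged cell yields a value close to 0.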
3. Modulation Using YCbCr Color Space and \({\rm J_za_zb_z}\) Uniform Color Space
The modulation and demodulation process use the YCbCr color space or the \({\rm J_za_zb_z}\) uniform color space. As described in 2.1, when the data signal is modulated with signal intensity \(-\alpha/2 \Leftrightarrow \alpha/2\), along the chromaticity component axis in the desired color space, the pixels in the cell corresponding to bit 0 are added with the pixel value \(-\alpha/2\) and the pixels in the cell corresponding to bit 1 are added with the pixel value \(\alpha/2\). For example, when modulating using the YCbCr color space, a pixel value with \((Y, Cr, Cb)=(0.1, 0.2, 0.3)\) is shifted to \((0.1, 0.2\pm\alpha/2, 0.3)\) in the case of Cr component modulation and to \((0.1, 0.2, 0.3\mp\alpha/2)\) in the case of Cb component modulation. Note that since the direction of Cb in the axis is opposite to that of \({\rm b_z}\) in the \({\rm J_za_zb_z}\) uniform color space, the positive and negative signal intensities in the Cb component are reversed in the modulation to align the axes.
Details of the YCbCr color space and the \({\rm J_za_zb_z}\) uniform color space are described below.
3.1 YCbCr Color Space
The YCbCr color space is a coordinate system that is defined by linear transformation from the sRGB color space as Eqs. (1)-(3) and has Y (luminance component), Cr (redness-greenness chromaticity component: red in the positive direction and green in the negative direction), and Cb (yellowness-blueness chromaticity component: blue in the positive direction and yellow in the negative direction) as axes. Converting an image from the sRGB color space to the YCbCr color space (as well as other color spaces) means converting each pixel of the image.
Although the sRGB and YCbCr color spaces are widely used in image processing, the difference between colors represented in these color spaces is different from the difference between colors perceived by the human eye. Therefore, even when modulated with the same signal intensity, some colors are easier for the human eye to perceive the signal superimposition.
\[\begin{align} &Y = 0.299R + 0.587G + 0.114B \tag{1} \\ &Cr = 0.713(R-Y) +0.5 \tag{2} \\ &Cb = 0.564(B-Y) +0.5 \tag{3} \end{align}\]
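Equations (1)-(3) and the Cr or Cb modulation can be sketched per pixel as follows. The inverse transform is obtained by solving Eqs. (1)-(3) for R, G, and B. The function names are hypothetical, and clipping of out-of-range RGB values after modulation is deliberately omitted from this sketch.

```python
def rgb_to_ycrcb(r, g, b):
    """Eqs. (1)-(3): sRGB values in [0, 1] to (Y, Cr, Cb)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 0.713 * (r - y) + 0.5
    cb = 0.564 * (b - y) + 0.5
    return y, cr, cb


def ycrcb_to_rgb(y, cr, cb):
    """Inverse of Eqs. (1)-(3), solved for R, G, and B."""
    r = y + (cr - 0.5) / 0.713
    b = y + (cb - 0.5) / 0.564
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


def modulate_chroma(rgb, bit, alpha, component):
    """Add the data signal to Cr or Cb of one pixel and return the new RGB.

    The sign is reversed for Cb so that its direction matches that of b_z.
    """
    y, cr, cb = rgb_to_ycrcb(*rgb)
    shift = alpha / 2.0 if bit else -alpha / 2.0
    if component == "Cr":
        cr += shift
    elif component == "Cb":
        cb -= shift
    else:
        raise ValueError("component must be 'Cr' or 'Cb'")
    return ycrcb_to_rgb(y, cr, cb)
```

Because the transform is linear, the same \(\alpha\) always produces the same RGB shift regardless of the original color, which is the property that the nonlinear uniform color space does not share.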
3.2 \({\rm J_za_zb_z}\) Color Space
The \({\rm J_za_zb_z}\) uniform color space [9] is one of the uniform color spaces in which color differences perceived by the human eye as having the same magnitude correspond to the same distances in the color space. It is defined by a nonlinear transformation from the linear RGB color space as Eqs. (4)-(9) and has \({\rm J_z}\) (luminance component), \({\rm a_z}\) (redness-greenness chromaticity component: red in the positive direction and green in the negative direction), and \({\rm b_z}\) (yellowness-blueness chromaticity component: yellow in the positive direction and blue in the negative direction) as axes. The \({\rm J_za_zb_z}\) uniform color space is superior to other uniform color spaces in terms of perceptual uniformity, linearity of hue, and computational complexity.
When modulating using the \({\rm J_za_zb_z}\) color space, it is expected that the perceptibility of signal superimposition is independent of original color for the same signal intensity. However, since the modulation is performed by the nonlinear color space transformation, the amount of change in RGB values is different for each original color, and distortion in the received signal may degrade the demodulation performance.
\[\begin{align} & \begin{bmatrix} X \\Y \\Z \end{bmatrix} = \begin{bmatrix} 0.412453 &0.357580 &0.180423 \\ 0.212671 &0.715160 &0.072169 \\ 0.019334 &0.119193 &0.950227 \\ \end{bmatrix} \begin{bmatrix} R' \\G' \\B' \end{bmatrix} \tag{4} \\ & \begin{bmatrix} X' \\Y' \end{bmatrix} = \begin{bmatrix} bX \\gY \end{bmatrix} - \begin{bmatrix} (b-1)Z \\ (g-1)X \end{bmatrix} \tag{5} \\ & \begin{bmatrix} L \\M \\S \end{bmatrix} = \begin{bmatrix} 0.41478972 &0.579999 &0.0146480 \\ -0.2015100 &1.120649 &0.0531008 \\ -0.0166008 &0.264800 &0.6684799 \\ \end{bmatrix} \begin{bmatrix} X' \\Y' \\Z \end{bmatrix} \tag{6} \\ & \{L',\ M',\ S'\} = \left( \frac{c_1 + c_2 \left(\frac{\{L,\ M,\ S\}}{10000}\right)^n}{1 + c_3 \left(\frac{\{L,\ M,\ S\}}{10000}\right)^n} \right)^p \tag{7} \\ & \begin{bmatrix} I_z \\a_z \\b_z \end{bmatrix} = \begin{bmatrix} 0.5 &0.5 &0 \\ 3.524000 &-4.066708 &0.542708 \\ 0.199076 &1.096799 &-1.295875 \\ \end{bmatrix} \begin{bmatrix} L' \\M' \\S' \end{bmatrix} \tag{8} \\ & J_z=\frac{(1+d)I_z}{1+dI_z}-d_0 \tag{9} \end{align}\]
where
\[\begin{aligned} &b=1.15,\ g=0.66, \nonumber\\ &c_1=3424/2^{12},\ c_2=2413/2^7,\ c_3=2392/2^7,\nonumber\\ &n=2610/2^{14},\ p=1.7\times2523/2^5,\nonumber\\ &d=-0.56,\ d_0=1.6295499532821566\times10^{-11}. \nonumber \end{aligned}\]
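The forward transformation can be sketched per pixel as below, following Eqs. (4)-(9) with the constants listed above. The nonlinearity uses the same exponent \(n\) in the numerator and the denominator, as in the published \({\rm J_za_zb_z}\) definition. The function names are hypothetical, and two points are assumptions of this sketch rather than statements from the text: the absolute scaling of the linear RGB input is left to the caller, and negative cone responses are clamped to zero before the power function.

```python
B, G = 1.15, 0.66
C1, C2, C3 = 3424 / 2 ** 12, 2413 / 2 ** 7, 2392 / 2 ** 7
N, P = 2610 / 2 ** 14, 1.7 * 2523 / 2 ** 5
D, D0 = -0.56, 1.6295499532821566e-11

RGB_TO_XYZ = ((0.412453, 0.357580, 0.180423),
              (0.212671, 0.715160, 0.072169),
              (0.019334, 0.119193, 0.950227))
XYZ_TO_LMS = ((0.41478972, 0.579999, 0.0146480),
              (-0.2015100, 1.120649, 0.0531008),
              (-0.0166008, 0.264800, 0.6684799))
LMS_TO_IAB = ((0.5, 0.5, 0.0),
              (3.524000, -4.066708, 0.542708),
              (0.199076, 1.096799, -1.295875))


def _matvec(matrix, vector):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def _nonlinearity(x):
    """Eq. (7) for one of L, M, S."""
    t = (x / 10000.0) ** N
    return ((C1 + C2 * t) / (1.0 + C3 * t)) ** P


def linear_rgb_to_jzazbz(r, g, b):
    """Eqs. (4)-(9): linear RGB (R', G', B') to (J_z, a_z, b_z)."""
    x, y, z = _matvec(RGB_TO_XYZ, (r, g, b))                 # Eq. (4)
    xp = B * x - (B - 1.0) * z                               # Eq. (5)
    yp = G * y - (G - 1.0) * x
    l, m, s = _matvec(XYZ_TO_LMS, (xp, yp, z))               # Eq. (6)
    # Clamp to zero so the fractional power stays real (sketch assumption).
    lp, mp, sp = (_nonlinearity(max(v, 0.0)) for v in (l, m, s))
    iz, az, bz = _matvec(LMS_TO_IAB, (lp, mp, sp))           # Eq. (8)
    jz = (1.0 + D) * iz / (1.0 + D * iz) - D0                # Eq. (9)
    return jz, az, bz
```

As a sanity check on the axis directions, a pure red input gives a positive \({\rm a_z}\), a pure green input gives a negative \({\rm a_z}\), and a pure blue input gives a negative \({\rm b_z}\).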
4. Diversity Method Using Two Chromaticity Components
The proposed method simultaneously modulates the same data signal with two chromaticity components and uses diversity combining in the demodulation. It is expected that the communication quality will be improved when the two chromaticity components compensate each other for image-dependent weak modulation colors.
4.1 Modulation
In the two chromaticity components modulation, the same data signal is added to both the redness-greenness and yellowness-blueness chromaticity components in the modulation process described in 2.1. This results in a transmission image in which the same data signal is superimposed with the same signal intensity \(-\alpha/2 \Leftrightarrow \alpha/2\) on both chromaticity components of the image content. The concept of the modulation process is shown in Fig. 3 for the \({\rm J_za_zb_z}\) uniform color space. In the YCbCr color space, the positive and negative signs of the signal intensity of the yellowness-blueness chromaticity component are reversed, as described in Sect. 3.
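For the YCbCr case, the two-component modulation of a single pixel can be sketched as follows, operating directly on a \((Y, Cr, Cb)\) triple. The function name is hypothetical.

```python
def modulate_two_components(y, cr, cb, bit, alpha):
    """Add the same data signal to Cr and, with reversed sign, to Cb."""
    shift = alpha / 2.0 if bit else -alpha / 2.0
    return y, cr + shift, cb - shift
```

Applied to the example pixel \((Y, Cr, Cb)=(0.1, 0.2, 0.3)\) with \(\alpha = 0.02\), bit 1 gives approximately \((0.1, 0.21, 0.29)\) and bit 0 gives approximately \((0.1, 0.19, 0.31)\).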
4.2 Demodulation
In demodulation for the two chromaticity components, selective combining (SC) or maximum ratio combining (MRC) is applied to obtain diversity gain. The concept of the demodulation process after the difference extraction in each of the two chromaticity components is shown in Fig. 4. It is applicable to both the YCbCr color space and the \({\rm J_za_zb_z}\) uniform color space because the demodulation is based on the absolute value of the average cell difference as described in 2.2.
In the SC, for each cell at the same position of \(\bar{D}_{rg}(i)\) and \(\bar{D}_{yb}(i)\), the cell value farther from the cell threshold value is selected for demodulation and is binary-thresholded using the corresponding threshold value, and thus demodulated data \(\hat{d}_{SC(rg,yb)}(i)\) is obtained.
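A minimal sketch of the SC decision is given below. The inputs are the cell-value matrices and threshold matrices of the two components as nested lists. The function name is hypothetical, and choosing the redness-greenness component when the two distances are equal is an arbitrary choice of this sketch.

```python
def selective_combining(s_rg, s_yb, t_rg, t_yb):
    """Per cell, decide with the component whose value is farther from its threshold."""
    bits = []
    for row_srg, row_syb, row_trg, row_tyb in zip(s_rg, s_yb, t_rg, t_yb):
        out_row = []
        for srg, syb, trg, tyb in zip(row_srg, row_syb, row_trg, row_tyb):
            if abs(srg - trg) >= abs(syb - tyb):
                out_row.append(1 if srg > trg else 0)
            else:
                out_row.append(1 if syb > tyb else 0)
        bits.append(out_row)
    return bits
```

In this sketch, a cell whose redness-greenness value lies almost on its threshold is decided by the yellowness-blueness component alone, and vice versa.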
In the MRC, weighted addition based on the received signal intensity is performed for each cell as follows:
\[\begin{aligned} &(s_{rg(m,n)}-T_{rg(m,n)})A_{rg(m,n)} \nonumber\\ & + (s_{yb(m,n)}-T_{yb(m,n)})A_{yb(m,n)} \nonumber \end{aligned}\]
where \(s_{rg(m,n)}\) and \(s_{yb(m,n)}\) are the cell values at cell position \((m,n)\) of \(\bar{D}_{rg}(i)\) and \(\bar{D}_{yb}(i)\), and \(T_{rg(m,n)}\) and \(T_{yb(m,n)}\) are the cell threshold values calculated from the pilot frames. This formula means that the differences between the received value and the threshold value in each color component are added together after applying weights. \(A_{rg(m,n)}\) and \(A_{yb(m,n)}\) are the weights, given by the difference between the cell values of bit 0 and bit 1 of the pilot frame, and represent the received signal intensity corresponding to \(\alpha\). Thus, the color component with a larger received signal is weighted more. After the weighted addition, demodulated data \(\hat{d}_{MRC(rg,yb)}(i)\) is obtained by performing a binary decision using the threshold value of 0.
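The weighted addition and the decision at threshold 0 can be sketched as follows. The weight matrices are taken as inputs here, so how they are measured from the pilot frames is outside this sketch, and the function name is hypothetical.

```python
def maximum_ratio_combining(s_rg, s_yb, t_rg, t_yb, a_rg, a_yb):
    """Per cell, weight (value - threshold) of each component and decide at 0."""
    bits = []
    for rows in zip(s_rg, s_yb, t_rg, t_yb, a_rg, a_yb):
        out_row = []
        for srg, syb, trg, tyb, arg, ayb in zip(*rows):
            metric = (srg - trg) * arg + (syb - tyb) * ayb
            out_row.append(1 if metric > 0 else 0)
        bits.append(out_row)
    return bits
```

For instance, with \(s_{rg}=0.045\), \(T_{rg}=0.05\), \(A_{rg}=0.1\) and \(s_{yb}=0.05\), \(T_{yb}=0.03\), \(A_{yb}=0.06\), the metric is \(-0.0005 + 0.0012 = 0.0007 > 0\), so the cell is decided as bit 1 even though the redness-greenness component alone would give bit 0.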
The SC and the MRC differ in that in each cell, the SC reflects only one of the two chromaticity components, while the MRC reflects the values of both chromaticity components. The SC and MRC are proposed here to see if this difference affects the demodulation performance.
5. Performance Evaluation
The digital signage and image sensor-based VLC must be implemented without interfering with the displayed image content itself. Therefore, a trade-off evaluation based on both communication quality and visual quality is required.
However, it is difficult to discuss the superiority of the \({\rm J_za_zb_z}\) uniform color space mathematically. While the YCbCr color space is a linear transformation of the sRGB color space, the \({\rm J_za_zb_z}\) uniform color space is a nonlinear transformation because it represents color differences perceived by the human eye. In addition, while the Cb and Cr value ranges corresponding to the sRGB value range of 0 to 1 are \(0\) to \(1\) according to Eqs. (2) and (3), the \({\rm a_z}\) value range is \(-0.093\) to \(0.100\) and the \({\rm b_z}\) value range is \(-0.157\) to \(0.116\), so the scales of the axes are very different. It is therefore not meaningful to compare the YCbCr modulation and the \({\rm J_za_zb_z}\) modulation at the same signal intensity. Instead, we compare the communication quality of the YCbCr modulation and the \({\rm J_za_zb_z}\) modulation by setting the signal intensity so that the image after modulation has the same visual quality. Here, the visual quality is evaluated quantitatively by using the color difference formula as described in 5.1, and the communication quality is evaluated experimentally by using the bit error rate as described in 5.2.
5.1 Visual Quality Evaluation Index
Since the visual quality deteriorates when data signals are superimposed on the image content, it is required to quantify the visual quality of the displayed image. In this paper, the visual quality is evaluated quantitatively by using the CIEDE2000 formula [10] for the color difference when bit 0 and bit 1 are modulated in the image content. The color difference is calculated for each pixel of the images before and after modulation, and the root mean square (RMS) of the color difference values is defined as the visual quality.
The CIEDE2000 is a standard for quantifying the amount of visual change in color. It defines a color difference formula that calculates a corrected distance between colors in the CIELAB color space, but it does not constitute a color space itself. Since the modulation methods in this paper modulate the color of the image content, the CIEDE2000 is used as an evaluation index to quantitatively measure the amount of visual change caused by modulation. When the CIEDE2000 values of the modulation methods are compared, a lower value means better visual quality. In particular, a color difference of less than 1 is known to be difficult for the human eye to perceive.
For easy comparison, the values of the signal intensity \(\alpha\) were calculated numerically so that every modulation method yields the same color difference value, and the communication experiments were then performed with these values. The same signal intensity is used for all pixels, so the signal intensity does not change depending on the original color value of the image content. Since the signal intensity is determined per image, the color difference values vary from pixel to pixel. In general, color difference evaluation is based on the average value of the color difference. However, even if the signal intensity is set so that the average color difference is 1, there may be pixels with a color difference larger than 1 due to the variation. Therefore, in this paper, the color difference is evaluated more strictly based on the RMS, which includes the variation in addition to the average value. As mentioned above, the RMS value is set to be the same for each modulation method, and the same applies to both one-chromaticity component modulation and two-chromaticity component modulation. Because the signal intensity is chosen to give the same color difference, its value differs between modulation methods.
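The following sketch illustrates two points of this procedure. First, the RMS combines the average and the variation, since \(\mathrm{RMS}^2 = \mathrm{mean}^2 + \mathrm{variance}\). Second, a signal intensity that gives a target RMS can be found numerically, here by bisection. The bisection is an assumed search method, not necessarily the one used in the paper, and it requires the RMS to increase with \(\alpha\). The callable `rms_for_alpha` is hypothetical and stands for "modulate the image with this \(\alpha\), compute the per-pixel CIEDE2000 values, and return their RMS".

```python
import math


def rms(values):
    """Root mean square of per-pixel color differences."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mean_std(values):
    """Mean and (population) standard deviation of per-pixel color differences."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


def solve_alpha(rms_for_alpha, target, lo=0.0, hi=1.0, iterations=60):
    """Bisection for the alpha at which rms_for_alpha(alpha) equals target.

    Assumes rms_for_alpha is increasing in alpha on [lo, hi].
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if rms_for_alpha(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For per-pixel color differences of 0.5, 1.0, and 1.5, the average is 1 but the RMS is larger than 1, which is why aligning the RMS is the stricter criterion.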
The \({\rm J_za_zb_z}\) uniform color space is a color space constructed based on the CIEDE2000, and the \({\rm J_za_zb_z}\) modulation is expected to reduce the variation in the color difference compared to the YCbCr modulation. Table 1 shows examples of the relationship between the signal intensity and the color difference calculated for 16777216 colors (all 24-bit RGB colors). From this table, we can see that the standard deviation is larger for the Cb and Cr modulations, even if the average of the color difference is the same 1. In particular, the color difference variation is reduced by half in the \({\rm b_z}\) modulation. This is because the chromaticity difference in the blue direction has a large discrepancy from the color difference perceived by the human eye. When the visual quality is aligned at the same RMS value, the \({\rm J_za_zb_z}\) modulation has a gain over the YCbCr modulation in terms of the signal intensity.
5.2 Communication Quality Evaluation Index
The communication quality is experimentally evaluated using the bit error rate. The communication quality depends on the signal intensity, the nonlinearity of color space conversion, and the characteristics of the communication channel. The communication channel can be regarded as a single communication channel from the R, G, and B values of the transmitted image to the R, G, and B values of the received image, and includes the emission characteristics of the display, the characteristics of the lens, the characteristics of the image sensor, and the effects of background light. This communication channel is the same regardless of the modulation method. The channel characteristics are described in 5.3.
The experimental setup is shown in Fig. 5. In this experiment, it is assumed that both the transmitter and the receiver are stationary. The transmitter consists of a computer and a display, LG Electronics Japan OLED55CXPJA, calibrated by X-Rite i1Display Pro Plus and CalMAN AutoCAL. Figure 6 shows the spectral power distribution of the displayed white color, measured with a spectrometer, SEKONIC C-800 SPECTROMASTER. The receiver consists of a computer and a camera, The Imaging Source DFK37BUX178 with a VS-0818VM 8 mm lens, mounted on a tripod. The camera has a Sony 1/1.8-inch IMX178 CMOS image sensor with a rolling shutter, whose spectral sensitivity is shown in Fig. 7. It has almost the same specifications as image sensors used in smartphones. In addition, it is capable of RAW RGB capture, and no image compression is applied in the evaluation. The color temperature of the white point of the transmitter and the white balance of the receiver are set to the CIE standard illuminant D65.
Table 2 shows the experimental specifications of the communication quality evaluation. The illuminance was measured at the same position as the light-receiving part of the camera, with the light-receiving part of the illuminance meter pointing straight up. The timing of updating the displayed image and capturing the image is assumed to be synchronized, i.e., errors due to asynchrony between the display and the camera are excluded from the evaluation, but the exposure time is set to \(1/30\) [s] assuming \(30\) fps capturing. In order to remove the influence of computing resources, the demodulation is not processed in real time. After capturing the transmitter's display with the receiver's camera, the portion of the transmission image is extracted and demodulated using the computer.
The image content used for the evaluation is the set of standard test images [12] defined by the ITE for evaluating the performance of imaging systems and schemes and their image quality. The ten images shown in Fig. 8 were selected by the ITE to be suitable for evaluating different resolutions, tones, colors, degradation due to digital processing, etc. with a small number of images, and are therefore also considered suitable for evaluating the modulation methods. The experiment assumes that the receiver knows that the first and second frames are the pilot frames. Even if this is not the case, pilot frames and data frames can be easily separated with a simple threshold because the pilot frames, unlike the data frames, have alternating bit 0 and bit 1 cells. Since the thresholds are set from the pilot frames in each measurement, the bit error rate is calculated as the average over 10 measurements, each consisting of 70 data frames transmitted at the same signal intensity with the same data sequence, plus two pilot frames used to set the thresholds.
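The averaging of the bit error rate over repeated measurements can be sketched as follows, with each measurement given as a pair of flat bit lists (sent and demodulated). The function names and the data layout are assumptions for illustration.

```python
def bit_error_rate(sent_bits, received_bits):
    """Fraction of positions where the demodulated bit differs from the sent bit."""
    if len(sent_bits) != len(received_bits):
        raise ValueError("bit sequences must have the same length")
    errors = sum(1 for s, r in zip(sent_bits, received_bits) if s != r)
    return errors / len(sent_bits)


def average_ber(measurements):
    """Mean of the per-measurement bit error rates."""
    rates = [bit_error_rate(sent, received) for sent, received in measurements]
    return sum(rates) / len(rates)
```

Because every measurement in the experiment carries the same number of data bits, averaging the per-measurement rates is equivalent to dividing the total number of errors by the total number of bits.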
In this paper, we focus on the difference in modulation performance between the YCbCr and \({\rm J_za_zb_z}\) uniform color spaces and the difference in the chromaticity components used. Therefore, the evaluation should exclude any communication errors other than those caused by the color space and chromaticity components used for modulation. If the transmitter and receiver are asynchronous, inter-frame interference will occur in the received image, resulting in a communication error that mainly depends on the frame rate difference between the transmitter and receiver. If video is used instead of a still image, errors will occur in the difference process during demodulation, resulting in a communication error that mainly depends on the difference between video frames. If the position of the transmitter is not known and must be detected from the captured image, or if the camera is not stationary but moving, the data frames will be misaligned in the captured image, resulting in a communication error that mainly depends on the accuracy of the position detection. None of these communication errors is caused by the color space or the chromaticity component. They are issues that should be evaluated when designing a system, but they are beyond the scope of this paper.
5.3 Channel Characteristics
Considering the transmitted image to the received image as a single communication channel, the noise added to the received image is mainly due to variations caused by individual LEDs of the digital signage display, background light, and noise generated by the image sensor. In addition to noise, the received image is affected by interference between R, G, and B due to the emission characteristics of the display and the sensitivity characteristics of the image sensor, as well as inter-pixel interference due to the lower received resolution relative to the transmitted image resolution and lens aberrations. As discussed below, the noise can be approximated collectively as Gaussian noise, and the communication channel characteristics can be approximated as linear transformations to linear RGB values.
Additional experimental evaluation was performed to clarify the channel characteristics, including the statistical characteristics of the channel noise. We evaluated the relationship between the R, G, and B values of the transmitted images and the R, G, and B values of the received images by displaying a gradation image that changes linearly from white to black 10 times and capturing it with the same settings as in the communication experiment. Figure 9(a) shows the scatter diagram of pixel values with the R, G, and B values (sRGB) of the transmitted image on the horizontal axis and the R, G, and B values (sRGB) of the received image on the vertical axis. Since the relationship is clearer when the horizontal and vertical axes are represented by the gamma-corrected R, G, and B values on the display (linear RGB) and the R, G, and B values of the received image before inverse gamma correction (linear RGB), this representation is shown in Fig. 9(b). In addition, the distribution of the received R, G, and B values for several transmitted values is shown in Fig. 10.
Figure 9(a) shows that the received RGB values are collapsed in the region where the transmitted RGB values are low and close to black. This implies that the background light makes it difficult to distinguish transmitted values below about 0.2. In addition, the received values are saturated with transmitted values above about 0.9. This is due to the dynamic range of the image sensor. From Fig. 9(b), the received values have linear characteristics in linear RGB, except for the saturated region. In addition, from Fig. 10, the distribution of the received values is close to a Gaussian distribution and the variance is approximately constant.
The interference between R, G, and B can be modeled using the well-known method of color calibration. Specifically, by displaying a Macbeth color chart [13] and capturing it, the relationship between the transmitted and received RGB values is modeled as a \(3\times3\) linear matrix transformation. Figure 11 shows the displayed Macbeth color chart and the captured Macbeth color chart: (a) shows the gamma-corrected pixel values of the Macbeth color chart and (b) shows the pixel values of the captured image before inverse gamma correction, both displayed as linear RGB. The transformation matrix is obtained by minimizing the mean square error of the linear R, G, and B values between (a) and (b) for the 24 colors of the chart.
To summarize the above, the communication channel can be modeled as follows. The RGB values (sRGB) of the input transmitted image are gamma-corrected, followed by a perspective transformation and Gaussian blurring. Here, the experimental environment can be roughly approximated by a perspective transformation from the transmitted resolution of \(3840\times2160\) pixels (cell size: \(240\)) to the received resolution of \(1408\times792\) pixels (cell size: \(88\)) and a Gaussian blur with a kernel size of \(3\times3\) and a standard deviation of \(3.0\). Then, the RGB transformation including white Gaussian noise is performed as shown in Eq. (10), and finally inverse gamma correction is performed, resulting in the output received image (sRGB), where the output R, G, and B values are clipped to the range of 0 to 1.
\[\begin{align} \begin{bmatrix} R'_{Rx} \\G'_{Rx} \\B'_{Rx} \end{bmatrix} = \begin{bmatrix} C_{rr}&C_{gr}&C_{br} \\ C_{rg}&C_{gg}&C_{bg} \\ C_{rb}&C_{gb}&C_{bb} \end{bmatrix} \begin{bmatrix} R' \\G' \\B' \end{bmatrix} + \begin{bmatrix} n_{r} \\n_{g} \\n_{b} \end{bmatrix} \tag{10} \end{align}\]
where
\[\begin{aligned} \begin{bmatrix} C_{rr}&C_{gr}&C_{br} \\ C_{rg}&C_{gg}&C_{bg} \\ C_{rb}&C_{gb}&C_{bb} \end{bmatrix} = \begin{bmatrix} 0.83&0.22&0.00 \\ 0.15&0.86&0.15 \\ 0.057&0.35&0.88 \end{bmatrix}, \end{aligned}\]
and \(n_{r}\), \(n_{g}\), and \(n_{b}\) are additive white Gaussian noise terms with mean values of 0.034, 0.033, and 0.038 and standard deviations of 0.0081, 0.0068, and 0.0089, respectively.
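As an illustration, the per-pixel part of this channel model (gamma correction, Eq. (10), inverse gamma correction, and clipping) can be sketched as below. The matrix and noise parameters are the values given above. The function names are hypothetical, the standard piecewise sRGB transfer function is an assumption about the gamma correction used, and the perspective transformation and Gaussian blur are omitted.

```python
import random

# Color mixing matrix and noise parameters of Eq. (10), as listed in the text.
C = [[0.83, 0.22, 0.00],
     [0.15, 0.86, 0.15],
     [0.057, 0.35, 0.88]]
NOISE_MEAN = [0.034, 0.033, 0.038]
NOISE_STD = [0.0081, 0.0068, 0.0089]


def srgb_to_linear(v):
    """sRGB -> linear RGB (assumed: standard piecewise sRGB curve)."""
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(v):
    """Linear RGB -> sRGB (inverse of the curve above)."""
    return 12.92 * v if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055


def channel(rgb, rng=None, noisy=True):
    """Pass one sRGB pixel (values in 0..1) through the simplified channel.

    With noisy=False only the noise mean is added, which gives the
    expected (deterministic) output of the model.
    """
    rng = rng or random.Random(0)
    lin = [srgb_to_linear(v) for v in rgb]
    out = []
    for row, mu, sigma in zip(C, NOISE_MEAN, NOISE_STD):
        mixed = sum(c * x for c, x in zip(row, lin))
        noise = rng.gauss(mu, sigma) if noisy else mu
        received = linear_to_srgb(mixed + noise)
        out.append(min(max(received, 0.0), 1.0))  # clip output to [0, 1]
    return out
```

Under these assumptions, a black input `channel([0, 0, 0], noisy=False)` yields roughly 0.2 in every channel because of the nonzero noise means, and a white input saturates at 1, which is consistent with the floor and saturation regions observed in Fig. 9(a).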
5.4 Trade-Off Evaluation of the One Chromaticity Component Modulation
The trade-off evaluation is performed by using the calculated color differences and the bit error rate measurement results. For each of the image contents (a) to (j) shown in Fig. 8, Fig. 12 shows the color difference versus the bit error rate. These figures show the modulation and demodulation using one chromaticity component. The legend in the graphs shows the color components used for the modulation. The horizontal axis of the graph is the bit error rate, which indicates the communication quality, with lower values indicating better performance. The vertical axis of the graph is the color difference, which indicates the visual quality, with lower values indicating better performance. Therefore, the smaller the color difference with respect to the same bit error rate, or the smaller the bit error rate for the same color difference, the better the performance.
First, comparing the modulation using the YCbCr color space and the modulation using the \({\rm J_za_zb_z}\) uniform color space in the case with one chromaticity component in Fig. 12, the modulation using the \({\rm J_za_zb_z}\) uniform color space has better performance for both \({\rm a_z}\) and \({\rm b_z}\) modulation in (g) ChromaKey, (h) Sea, and (j) Ship. In other image contents, the \({\rm b_z}\) modulation is superior to the Cb modulation, while the \({\rm a_z}\) and Cr modulation varies from image to image. It can also be seen that in the modulation with one chromaticity component, the superiority or inferiority of the performance of the redness-greenness and yellowness-blueness chromaticity component modulation depends on the image contents. In other words, it is necessary to select the optimal chromaticity component according to the image contents.
The experimental results differ greatly between images because of the color components of the image content itself. This can be clarified by showing the relationship between the color components of the image content and the received signal intensity when bit errors occurred. Figure 13 shows scatter diagrams of the signal intensity ratio against the RGB color components of each cell, taken from the experimental results of each one chromaticity component modulation over all standard images, for the signal intensity \(\alpha\) that results in a color difference of 1. The horizontal axes indicate the average value of R, G, and B in each cell, and the vertical axes indicate the ratio of the received signal intensity corresponding to bit 1 to the signal intensity \(\alpha\) in each chromaticity component. Note that as described in 2.2, a cell-by-cell averaged image of the chromaticity component is calculated from the captured images and the cell value of this chromaticity component image is defined as the received signal intensity. The points denoted by “all” are the points of each RGB value and the signal intensity ratio for all cells. The points denoted by “error” are the points where bit errors occurred. Since the standard test images contain a very wide range of color components, the characteristics of the communication errors resulting from the color components can be clarified from Fig. 13. Figure 13 shows that errors occur when the values of R, G, and B are small for all modulation methods. In addition, there is a correlation between the error points for Cr modulation and \({\rm a_z}\) modulation with respect to the value of R, and for Cb modulation and \({\rm b_z}\) modulation with respect to the value of B.
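The cell-by-cell averaging and the signal intensity ratio plotted on the vertical axes can be sketched as follows. This is only an illustration: `cell_means` and `intensity_ratios` are hypothetical names, and the input is assumed to be a chromaticity image that already contains only the superimposed signal component, since the exact demodulation steps of 2.2 are not reproduced here.

```python
def cell_means(image, cell):
    """Average a 2-D chromaticity image over non-overlapping cell x cell blocks."""
    n_rows, n_cols = len(image) // cell, len(image[0]) // cell
    means = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            block = [image[y][x]
                     for y in range(r * cell, (r + 1) * cell)
                     for x in range(c * cell, (c + 1) * cell)]
            row.append(sum(block) / len(block))
        means.append(row)
    return means


def intensity_ratios(received, bits, alpha):
    """Ratio of received intensity to alpha for every cell that carried bit 1."""
    return [value / alpha
            for value_row, bit_row in zip(received, bits)
            for value, bit in zip(value_row, bit_row)
            if bit == 1]


# Toy 4x4 image split into 2x2 cells (the received images use 88-pixel cells).
image = [[0.02, 0.02, 0.00, 0.00],
         [0.02, 0.02, 0.00, 0.00],
         [0.00, 0.00, 0.01, 0.01],
         [0.00, 0.00, 0.01, 0.01]]
bits = [[1, 0], [0, 1]]
received = cell_means(image, 2)
ratios = intensity_ratios(received, bits, alpha=0.02)
```

In this toy case the first bit-1 cell is received at full intensity (ratio close to 1) and the second at half intensity (ratio close to 0.5); cells whose ratio falls toward zero are the ones that tend to appear as error points in a diagram such as Fig. 13.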
As described in 5.3, the error points at RGB values below about 0.2 are mainly due to the influence of background light noise, and the error points at RGB values above about 0.9 are due to the saturation of the received values caused by the dynamic range of the image sensor. For images that contain many such values in each RGB component, bit errors occur in both the YCbCr color space and the \({\rm J_za_zb_z}\) uniform color space, so the performance difference between the YCbCr modulation and the \({\rm J_za_zb_z}\) modulation becomes small. In particular, (f) Butterflies contains a lot of near-black color in the background, and many errors occur regardless of which color component is used. In fact, (f) Butterflies is included in the standard images as an image that can be used to check the black levels of the display.
Fig. 13 Scatter diagrams of signal intensity ratio and error points for each one chromaticity component modulation.
Based on the above, we select some image contents to illustrate that the experimental results of the one chromaticity component modulation depend on the image content. As the first example, we take (c) Moss and (g) ChromaKey. As shown in Fig. 12, at color difference 1, the Cr component modulation is error-free for Moss, while ChromaKey causes many bit errors. The scatter diagrams for Moss and ChromaKey of the signal intensity ratio versus R in the Cr component modulation are overlaid in Fig. 14. The points denoted by the image name indicate the cells in that image, and “error (ChromaKey)” indicates the cells where bit errors occurred in the case of ChromaKey. Figure 14 shows that ChromaKey has many cells with small R values, and these cells have low signal intensity ratios, resulting in bit errors. On the other hand, Moss has no cells with such small R values, so no bit error occurs. Thus, the difference in performance of the Cr component modulation in these images was caused by the R value of the image content. As the next example, we take (h) Sea and (j) Ship. As shown in Fig. 12, at color difference 1, the \({\rm b_z}\) component modulation is error-free for Ship, while Sea causes many bit errors. In the same way as in Fig. 14, the scatter diagrams for Ship and Sea of the signal intensity ratio versus B in the \({\rm b_z}\) component modulation are overlaid in Fig. 15. Figure 15 shows that Sea has many cells with large B values, and these cells have low signal intensity ratios, resulting in bit errors. On the other hand, Ship has no cells with such large B values, so no bit error occurs. Thus, the difference in performance of the \({\rm b_z}\) component modulation in these images was caused by the B value of the image content.
Fig. 14 Scatter diagrams for Moss and ChromaKey of signal intensity ratio versus R in Cr component modulation.
Fig. 15 Scatter diagrams for Ship and Sea of signal intensity ratio versus B in \({\rm b_z}\) component modulation.
To validate the experimental results, computer simulations using the simplified communication channel model described in 5.3 were also performed. The simulation results obtained for the one chromaticity component modulation are shown in Fig. 16, in the same format as the experimental results in Fig. 12. It can be confirmed that the simulation results in Fig. 16 and the experimental results in Fig. 12 show similar characteristics. As mentioned in the discussion of the experimental results, the \({\rm b_z}\) modulation is superior to the Cb modulation in all images, while the superiority of the \({\rm a_z}\) and Cr modulation varies from image to image. The simulation results also show that the \({\rm b_z}\) modulation is superior to the Cb modulation. However, there is a slight difference in the superiority relationship between the \({\rm a_z}\) and Cr modulation. This is because the channel characteristics (i.e., the relationship between R, G, and B) do not match perfectly between the simulation and the experiment, and the colors of the received images are slightly different, resulting in a performance difference in the \({\rm a_z}\) and Cr modulation. Note that an exact simulation model of the experimental communication channel is beyond the scope of this paper, so the simulation is limited to the simplified model, which is a linear approximation with a \(3\times3\) matrix in linear RGB. Although this is the well-known method of color calibration, the relationship between R, G, and B in the real channel is not linear due to the spectral power distribution of the display and the spectral sensitivity of the camera. In addition, the experimental results include lens aberrations and reflected light from surrounding objects, while the simulation results do not. Therefore, the experimental and simulation results do not match perfectly, and the superiority relationship may be reversed.
5.5 Trade-Off Evaluation of the Two Chromaticity Component Modulation
For each of the image contents (a) to (j) shown in Fig. 8, Fig. 17 shows the color difference versus the bit error rate. These figures show the modulation and demodulation with two chromaticity components and diversity combining. The way to read the graph is the same as explained in 5.4. The legends SC and MRC indicate selective combining and maximum ratio combining, respectively, and the color components used are described in parentheses.
In the proposed method of modulation and diversity combining with two chromaticity components shown in Fig. 17, a comparison of the YCbCr modulation and the \({\rm J_za_zb_z}\) modulation shows that the \({\rm J_za_zb_z}\) uniform color space has no advantage. Compared to the results of the one chromaticity component cases in Fig. 12, the performance of the diversity combining is as good as or better for all image contents. There is no significant difference in performance between the selective combining and the maximum ratio combining. These results show that the proposed diversity combining is a superior method that does not require selecting the best chromaticity component according to the image contents. However, when a color difference of less than 1 is taken as imperceptible and less than 1 bit error per frame as an acceptable error rate, only (f) Butterflies fails to meet both criteria, even with the proposed method. This is because (f) Butterflies contains a lot of near-black color in the background, and many bit errors occur regardless of which color component is used.
The effect of diversity demodulation with two chromaticity components appears when bit errors occurring in one chromaticity component can be compensated for by the other chromaticity component. In other words, the effect of diversity demodulation appears when the error points are different when each chromaticity component is demodulated singly. This can be clarified by showing the error points when demodulated with a single chromaticity component in a scatter diagram that shows the correlation of the received signal intensities of the two chromaticity components.
Figure 18 shows the scatter diagram of received signal intensities of the two chromaticity components in the experimental results of the diversity demodulation of each standard image. The vertical and horizontal axes indicate the received signal intensity of the respective chromaticity component. Note that as described in 4.2, a cell-by-cell averaged image of each of the two chromaticity components is calculated from the captured images and the cell values of these two chromaticity component images are plotted on the scatter diagram as the received signal intensities. To illustrate the effect of diversity demodulation, it is evaluated at the signal intensity at which the bit error occurs in each standard image. The black point denoted by “signal intensity” represents the signal intensity \(\alpha\) at the modulation and is the signal point corresponding to data bit 1. The origin of the scatter diagram is the signal point corresponding to data bit 0. Ideally, the points of received signal intensity should be concentrated at the point of origin and signal intensity, but the distribution is spread out due to the influence of communication channel. The points denoted by “signal 0” and “signal 1” are the receive signal points for data bits 0 and 1, respectively. The points denoted by “Cb error,” “Cr error”, “\({\rm a_z}\) error”, and “\({\rm b_z}\) error” show the points where bit errors occurred when demodulating with only a single chromaticity component, and the points denoted by “SC error” and “MRC error” show the points where bit errors occurred when diversity demodulation was performed.
Fig. 18 Scatter diagrams of received signal intensity and error points for each chromaticity component in the diversity demodulation.
From Fig. 18, the error points that overlap when demodulating with only a single chromaticity component also cause bit errors in the diversity demodulation, while the error points that do not overlap can be improved by the diversity demodulation. Especially in Fig. 18(h) Sea, although many bit errors occur when demodulation is performed with only a single chromaticity component, the error points have little overlap in the two color components, and thus the diversity demodulation is highly effective. On the other hand, in the case of (f) Butterflies, the distribution is spread over small received signal intensity values, and most of the error points when demodulated with only a single chromaticity component are also overlapping. Therefore, the diversity demodulation is ineffective.
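The compensation effect can be illustrated with a small sketch. The decision rules below are textbook-style stand-ins chosen for illustration and are assumptions: the actual SC and MRC rules of this paper are defined in 4.2 and may differ. Bit 0 is assumed to map to intensity 0 and bit 1 to intensity \(\alpha\) in both components, with a threshold at \(\alpha/2\); all function names are hypothetical.

```python
def decide_single(r, alpha):
    """Hard decision on one chromaticity component (threshold at alpha / 2)."""
    return 1 if r > alpha / 2 else 0


def decide_sc(r1, r2, alpha):
    """Selection-style combining (assumed rule): trust the component whose
    received intensity lies farther from the decision threshold."""
    best = r1 if abs(r1 - alpha / 2) >= abs(r2 - alpha / 2) else r2
    return decide_single(best, alpha)


def decide_mrc(r1, r2, alpha, w1=1.0, w2=1.0):
    """Ratio-style combining (assumed rule): weighted sum of both components
    compared with the equally weighted threshold. Equal weights by default."""
    return 1 if w1 * r1 + w2 * r2 > (w1 + w2) * alpha / 2 else 0


alpha = 1.0  # normalized signal intensity; bit 1 was transmitted in both cases

# Case 1: only the first component is in error, so the second can compensate.
non_overlapping = (0.3, 0.9)
# Case 2: both components are in error, so combining cannot recover the bit.
overlapping = (0.3, 0.4)

for r1, r2 in (non_overlapping, overlapping):
    print(decide_single(r1, alpha), decide_single(r2, alpha),
          decide_sc(r1, r2, alpha), decide_mrc(r1, r2, alpha))
```

In the first case both combining rules return bit 1 although the first component alone would decide bit 0; in the second case every rule decides bit 0, mirroring the observation that overlapping error points remain errors after diversity demodulation.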
The YCbCr modulation improves and outperforms the \({\rm J_za_zb_z}\) modulation in the diversity method using two chromaticity components because of the signal intensity used during modulation. To illustrate this, Table 3 shows the signal intensity that results in a color difference of 1 for each test image. As shown in Table 3, the signal intensity that results in a color difference of 1 varies depending on the image content. The signal intensity of the two chromaticity components modulation is often larger than that of the one chromaticity component modulation due to the nonlinearity of the color difference formula. Note that the signal intensities of \({\rm a_z}\) and \({\rm b_z}\) modulation are smaller than those of Cr and Cb modulation, but this comparison is meaningless because the ranges of possible values are different.
Comparing the signal intensity in the case of the one chromaticity component modulation with that in the case of the two chromaticity components modulation in Cr and Cb modulation, the signal intensity is almost the same or slightly larger in the case of the two chromaticity components modulation. Therefore, it can be said that the communication quality of the YCbCr modulation is simply improved by the diversity of the two chromaticity components.
On the other hand, comparing the signal intensity in the case of the one chromaticity component modulation with that in the case of the two chromaticity components modulation in \({\rm a_z}\) and \({\rm b_z}\) modulation, the signal intensity for the two chromaticity components modulation is almost the same as or slightly larger than that for the \({\rm a_z}\) modulation, but smaller than that for the \({\rm b_z}\) modulation. Therefore, the \({\rm b_z}\) component in the two chromaticity components modulation is degraded in terms of communication quality compared to the one chromaticity component modulation. This is the reason why the \({\rm J_za_zb_z}\) modulation is inferior to the YCbCr modulation in the two chromaticity components modulation.
Here, the simulation comparison is omitted because the simulation results and the experimental results show similar characteristics, as in the case of the one chromaticity component modulation.
6. Conclusions
In this study, the effectiveness of perceptually uniform color space for the modulation with chromaticity components in VLC between a digital signage and an image sensor was verified through a trade-off evaluation between communication quality and visual quality using various standard images. We also proposed a demodulation method using diversity combining for modulation with two chromaticity components and compared its performance with that of conventional methods using one chromaticity component.
This paper clarified the following two points:
- The \({\rm J_za_zb_z}\) modulation is superior to the YCbCr modulation in the one chromaticity component modulation.
- The two chromaticity components modulation with diversity demodulation is superior to the one chromaticity component modulation; however, the \({\rm J_za_zb_z}\) modulation is inferior to the YCbCr modulation in the two chromaticity components modulation.
First, the \({\rm J_za_zb_z}\) modulation has the advantage of better visual quality (RMS of the color differences before and after modulation) than the YCbCr modulation. However, as the \({\rm J_za_zb_z}\) uniform color space is a nonlinear transformation with respect to the sRGB color space, the communication quality may be inferior to the YCbCr modulation due to the nonlinear effect on the degradation of RGB values received through the communication channel. We compared the \({\rm J_za_zb_z}\) and YCbCr modulations by evaluating the trade-off between the visual quality and the communication quality, and showed that the \({\rm J_za_zb_z}\) modulation, especially the \({\rm b_z}\) modulation, is superior to the YCbCr modulation in the one chromaticity component modulation.
Secondly, the two chromaticity components modulation provides superior communication quality for the same signal intensity because it uses two chromaticity components to achieve diversity. However, the visual quality may be inferior to that of the one chromaticity component modulation because the data signal is superimposed on the two chromaticity components. We compared the one chromaticity component modulation and the two chromaticity components modulation using the same trade-off evaluation, and showed that the two chromaticity components modulation is superior to the one chromaticity component modulation for both YCbCr and \({\rm J_za_zb_z}\) modulation. In addition, as a result of the first and second evaluations, it was shown that the \({\rm J_za_zb_z}\) modulation is inferior to the YCbCr modulation in the two chromaticity components modulation. This is because, when the same signal intensity is used in both \({\rm a_z}\) and \({\rm b_z}\), the signal intensity that results in the same visual quality may be smaller than in the one chromaticity component modulation. In this paper, the same signal intensity was used to modulate the two chromaticity components, but the signal intensity does not have to be the same for the two chromaticity components. How to assign the signal intensities to the two chromaticity components to improve performance is left for future research.
A further open issue is the granularity at which the signal intensities are assigned. In this paper, the signal intensity was determined per image. However, it is theoretically possible to determine the signal intensity per pixel or per cell. One reason for determining the signal intensity per image rather than per pixel was that the modulation and demodulation methods are based on the conventional method [5], which determines the signal intensity per image. Another reason was that the color difference formula is nonlinear and complex, and it is difficult to calculate the signal intensity that results in the desired color difference value per pixel. It is easy to determine the signal intensity that results in the desired color difference value per image using a numerical search, but difficult per pixel. Since the signal intensity is uniquely determined by the desired communication or visual quality, the number of signal intensities to be set per image in our evaluation is 1 for per-image calculation, \(16\times9\) for per-cell calculation, or \(3840\times2160\) for per-pixel calculation. For a video, the number of signal intensities is multiplied by the number of frames in the video. Since most digital signage applications will repeatedly display predetermined image content, we assumed that these signal intensities only need to be calculated once when the image content is registered to the system. It is possible to calculate the signal intensities for all colors or all pixels in advance and store them in a look-up table. However, as is the case in our evaluation, high-definition image content is often 10-bit or higher RGB and 4K or higher, and the total number of colors in 10-bit RGB and pixels in a 4K image is about \(10^9\) colors and \(10^7\) pixels.
In addition, our evaluation involves a total of 490 settings, including 10 test images, 7 color difference values from 0.4 to 1.6, and 6 combinations of color spaces, chromaticity components, and diversity, which makes the pre-calculation time too long for the evaluation to be practical. How to quickly calculate the signal intensity that results in the desired color difference remains an open question. Changing the signal intensity per pixel or per cell to align color differences may reduce the variation of color differences in the transmitted image and improve the visual quality, but it also affects the communication quality because the signal intensity then varies per pixel or per cell. Comparing the trade-off performance when the signal intensity is set per image, per cell, or per pixel is an interesting topic for future research.
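The per-image numerical search mentioned above can be sketched as a simple bisection. This is a minimal illustration under stated assumptions: `find_signal_intensity` is a hypothetical name, the color difference is assumed to be a continuous, non-decreasing function of the signal intensity on the search interval, and `toy_color_difference` is a stand-in, not the color difference formula used in this paper.

```python
def find_signal_intensity(color_difference, target, lo=0.0, hi=1.0, tol=1e-6):
    """Bisection search for the alpha at which color_difference(alpha) == target.

    Assumes color_difference is continuous and non-decreasing on [lo, hi].
    """
    if not color_difference(lo) <= target <= color_difference(hi):
        raise ValueError("target is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if color_difference(mid) < target:
            lo = mid   # color difference too small: increase the intensity
        else:
            hi = mid   # color difference large enough: decrease the intensity
    return (lo + hi) / 2


def toy_color_difference(alpha):
    """Stand-in for the per-image color difference (illustrative only)."""
    return 40.0 * alpha ** 1.5


alpha = find_signal_intensity(toy_color_difference, target=1.0)
```

Each call evaluates the color difference about 20 times for the default tolerance. One such search per image is inexpensive, whereas a per-cell or per-pixel assignment would require \(16\times9 = 144\) or \(3840\times2160 = 8{,}294{,}400\) searches per image, respectively.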
Acknowledgments
A part of this work was supported by JSPS KAKENHI Grant Number 21K04047 and Joint Research Program of IMaSS, Nagoya University, Japan.
References
[1] K. Kuraki, S. Nakagata, R. Tanaka, and T. Anan, “Data transfer technology to enable communication between displays and smart devices,” FUJITSU Sci. Tech. J., vol.50, no.1, pp.40-45, Jan. 2014.
[2] H. Aoyama and M. Oshima, “Visible light communication using a conventional image sensor,” Proc. IEEE Consumer Commun. Netw. Conf. (CCNC), pp.103-108, Jan. 2015.
[3] Y. Lin, T. Wada, K. Mukumoto, and H. Okada, “Performance evaluation of information embedding schemes based on wavelet transform for parallel transmission visible light communication systems,” Proc. IEEE Global Conf. Consumer Electron. (GCCE), pp.1-2, Oct. 2017.
[4] R. Mushu, T. Wada, K. Mukumoto, and H. Okada, “A proposal of information embedding scheme based on discrete cosine transform in parallel transmission visible light communications,” Proc. IEEE Global Conf. Consumer Electron. (GCCE), pp.175-176, Oct. 2018.
[5] H. Okada, S. Sato, T. Wada, K. Kobayashi, and M. Katayama, “Preventing degradation of the quality of visual information in digital signage and image-sensor-based visible light communication systems,” IEEE Photon. J., vol.10, no.3, Art. no.7903509, April 2018.
[6] H. Okada, S. Yoshida, T. Wada, K. Kobayashi, and M. Katayama, “Overlaying a motion video with data information for digital signage and image-sensor-based visible light communication systems,” IEICE ComEX, vol.8, no.8, pp.323-328, Aug. 2019.
[7] K. Shimei, K. Kobayashi, and W. Chujo, “Data signal modulation based on uniform color space for digital signage and image sensor based visible light communication,” IEICE ComEX, vol.11, no.1, pp.26-32, Oct. 2021.
[8] K. Shimei, K. Kobayashi, and W. Chujo, “A study on modulation and diversity methods based on uniform color space for digital signage and image sensor-based VLC,” 7th IEEE ICC Workshop on Optical Wireless Communications, pp.556-561, May 2022.
[9] M. Safdar, G. Cui, Y.J. Kim, and M.R. Luo, “Perceptually uniform color space for image signals including high dynamic range and wide gamut,” Opt. Express, vol.25, no.13, pp.15131-15151, June 2017.
[10] G. Sharma, W. Wu, and E.N. Dalal, “The CIEDE2000 color-difference formula: Implementation notes, supplementary test data, and mathematical observations,” Color Res. Appl., vol.30, no.1, pp.21-30, Feb. 2005.
[11] DFK 37BUX178 Technical Reference Manual, The Imaging Source, April 2019.
[12] Ultra-high definition/wide-color-gamut standard test images, ITE, https://www.ite.or.jp/content/chart/uhdtv/
[13] Free color checker chart, Acser123, https://upload.wikimedia.org/wikipedia/commons/4/4f/Color_Checker.pdf