| United States Patent Application | 20010014182 |
| Kind Code | A1 |
| FUNAYAMA, RYUJI; et al. | August 16, 2001 |
IMAGE PROCESSING APPARATUS
Abstract
An image processing apparatus includes a designating section for
designating an arbitrary region or an arbitrary position of an image; a
specifying section for specifying an object region which is present in
the designated region or position, and which can additionally be in a
vicinity of the designated region or position, from pixel information in
the designated region or position; a determining section for determining
an image region to be cut out from the image, based on the specified
object region; and a cutting section for cutting out the determined image
region from the image.
| Inventors: | FUNAYAMA, RYUJI; (NARA-SHI, JP); TAKEZAWA, HAJIME; (YAMATOKORIYAMA-SHI, JP); KONYA, MINEHIRO; (DAITO-SHI, JP); HAKARIDANI, MITSUHIRO; (IKOMA-SHI, JP) |
| Correspondence Address: | NIXON & VANDERHYE, 1100 NORTH GLEBE ROAD, 8TH FLOOR, ARLINGTON, VA 222014714 |
| Serial No.: | 097919 |
| Series Code: | 09 |
| Filed: | June 16, 1998 |
| Current U.S. Class: | 382/282 |
| Class at Publication: | 382/282 |
| International Class: | G06K 009/20; G06K 009/36 |
Foreign Application Data
| Date | Code | Application Number |
| Jun 20, 1997 | JP | 9-163890 |
Claims
What is claimed is:
1. An image processing apparatus, comprising: a designating section for
designating an arbitrary region or an arbitrary position of an image; a
specifying section for specifying an object region which is present in
the designated region or position, and which can additionally be in a
vicinity of the designated region or position, from pixel information in
the designated region or position and pixel information which can
additionally be in the vicinity of the designated region; a determining
section for determining an image region to be cut out from the image,
based on the specified object region; and a cutting section for cutting
out the determined image region from the image.
2. An image processing apparatus according to claim 1, wherein the
determining section includes a section for adjusting a size of the image
region to a prescribed size.
3. An image processing apparatus according to claim 1, wherein the
determining section includes a correcting section for entirely correcting
the designated image region or correcting only a part of the designated
image region.
4. An image processing apparatus, comprising: a designating section for
designating an arbitrary region or an arbitrary position of an image; an
analyzing section for analyzing a color distribution in the designated
region or position and in a vicinity of the designated region or
position; an adjusting section for adjusting a condition for specifying a
face image which is present in the image, according to a result of the
analysis; a specifying section for specifying a face image region which
is present in the designated region or position, and which can
additionally be in the vicinity of the designated region or position,
based on the adjusted condition; a determining section for determining an
image region to be cut out from the image, based on the specified face
image region; and a cutting section for cutting out the determined image
region from the image.
5. An image processing apparatus according to claim 4, wherein the
determining section includes a section for adjusting a size of the image
region, using the region or the position designated by the designating
section as a reference.
6. An image processing apparatus according to claim 4, wherein the
specifying section includes a section for applying noise elimination or
labelling to the specified face image region to produce a face mask; a
section for vertically scanning the produced face mask to obtain a sum of
vertical differential luminance values of pixels in the image
corresponding to the face mask to produce a histogram; and a section for
detecting a central axis of a face from a profile of the produced
histogram.
7. An image processing apparatus according to claim 4, wherein the
specifying section includes a section for applying noise elimination or
labelling to the specified face image region to produce a face mask; a
section for vertically scanning the produced face mask to obtain a mean
luminance value of pixels in the image corresponding to the face mask to
produce a histogram; and a section for detecting a vertical nose position
from a profile of the produced histogram.
8. An image processing apparatus according to claim 4, wherein the
specifying section includes a section for applying noise elimination or
labelling to the specified face image region to produce a face mask; a
section for horizontally scanning the produced face mask to obtain a mean
luminance value of pixels in the image corresponding to the face mask to
produce a histogram; and a section for detecting a vertical eye position
from a profile of the produced histogram.
9. An image processing apparatus according to claim 4, wherein the
specifying section includes a section for applying noise elimination or
labelling to the specified face image region to produce a face mask; a
section for horizontally scanning the produced face mask to obtain a mean
luminance value of pixels in the image corresponding to the face mask to
produce a histogram; and a section for detecting a vertical mouth
position from a profile of the produced histogram.
10. An image processing apparatus according to claim 9, wherein the
specifying section further includes a section for detecting a vertical
eye position from the profile of the produced histogram; and a section
for obtaining a middle position of a region between the detected vertical
eye position and the detected vertical mouth position to detect a width
of the face mask at the middle position.
11. An image processing apparatus according to claim 4, wherein the
determining section includes a section for adjusting a position of the
image region, based on the face image region, a central axis of a face in
the face image, a vertical nose position of the face in the face image, a
vertical eye position of the face in the face image, a vertical mouth
position of the face in the face image, and a width of a face mask of the
face image.
12. An image processing apparatus according to claim 4, wherein the
determining section includes a section for adjusting a size of the image
region, based on the face image region, a central axis of a face in the
face image, a vertical nose position of the face in the face image, a
vertical eye position of the face in the face image, a vertical mouth
position of the face in the face image, and a width of a face mask of the
face image.
13. An image processing apparatus according to claim 4, wherein the
determining section includes a correcting section for entirely correcting
the designated image region or correcting only a part of the designated
image region.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] The present invention relates to an image processing apparatus for
use in computers for conducting image processing, word processors,
portable information tools, copying machines, scanners, facsimiles or the
like. More specifically, the present invention relates to an image
processing apparatus enabling a user to designate the coordinates of any
point on the image by a coordinate input apparatus such as a mouse, a pen
or a tablet, or an image processing apparatus capable of
photoelectrically converting a printed image on a piece of paper or the
like with coordinates being designated in a different type of ink so as
to input the image and the coordinates, wherein the image processing
apparatus is capable of cutting out an object image with an arbitrary
size at an arbitrary position from the original image.
[0003] 2. Description of the Related Art
[0004] When an image including an object or a person's face of interest is
cut out from the original image, the image is cut with a desired size
using a pair of scissors, a cutter or the like, in the case of a
photograph. In the case of an electronic image obtained by a CCD camera
or a scanner, however, the positions of two points are designated by a
coordinate input device such as a mouse, using software for image
processing (e.g., the image processing software "PhotoShop" made by Adobe
Inc.), and a rectangle having a diagonal between the two points is
designated as a region.
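The two-point designation described above can be sketched in a few lines. This is only an illustration and not part of the cited software: the function names rect_from_diagonal and crop, and the list-of-rows image representation, are assumptions made for the example.

```python
def rect_from_diagonal(p1, p2):
    """Return (left, top, right, bottom) of the rectangle whose diagonal joins p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    # The two points may be given in any order, so sort each coordinate.
    return min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)


def crop(image, rect):
    """Cut the rectangle (inclusive bounds) out of an image stored as a list of rows."""
    left, top, right, bottom = rect
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy 5x4 image whose pixel value encodes its position (10 * y + x).
image = [[10 * y + x for x in range(5)] for y in range(4)]
rect = rect_from_diagonal((3, 2), (1, 0))
patch = crop(image, rect)
```

The sketch only shows that the region follows directly from the two designated points; placing the object at a well-balanced position inside it is still left to the operator.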
[0005] In order to output a part of the original image, which includes an
object of interest, as an image having a particular size, a portion
having the object of interest at a well-balanced position is first cut
out from the original image, and thereafter, is magnified/reduced to a
required size. In the case of a photograph, such magnification/reduction
is conducted by, for example, a copying machine. In the case of an
electronic image, magnifying/reducing the image to a desired size can be
easily carried out. However, cutting out a portion having the object of
interest at a well-balanced position must be conducted before such
magnification/reduction.
[0006] Furthermore, in order to extract a region representing a person's
face except for hair (hereinafter, this portion is referred to as a "face
skin") from the original image, a face skin region which is visually
determined by an operator is painted out. In the case of an electronic
image, a pixel is designated by a coordinate input device such as a
mouse, and those pixels having a similar color to that of the designated
pixel are combined to be extracted as one region (e.g., "PhotoShop" as
mentioned above). There is also a method as follows: the color
distribution of a face skin is analyzed in advance to set a probability
density function. Then, the probability density of the input pixels is
obtained using values such as RGB (red, green, blue) values and HSV (hue,
color saturation, brightness) values as arguments, thereby designating
those pixels having a probability equal to or higher than a prescribed
value as a face-skin region (R. Funayama, N. Yokoya, H. Iwasa and H.
Takemura, "Facial Component Extraction by Cooperative Active Nets with
Global Constraints", Proceedings of 13th International Conference on
Pattern Recognition, Vol. 2, pp. 300-305, 1996).
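The probability-density approach can be illustrated as follows. The form of the density function is not given here, so the sketch assumes, purely for illustration, an independent Gaussian per HSV channel with means and standard deviations estimated in advance; the names gaussian_pdf and skin_mask, the model values and the threshold are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def skin_mask(pixels, model, threshold):
    """Mark pixels whose density under the model is at least the threshold.

    pixels: rows of (h, s, v) tuples; model: one (mean, std) pair per channel.
    Assumption: channels are treated as independent, and the circular nature
    of hue is ignored for simplicity.
    """
    mask = []
    for row in pixels:
        out = []
        for px in row:
            density = 1.0
            for value, (mean, std) in zip(px, model):
                density *= gaussian_pdf(value, mean, std)
            out.append(1 if density >= threshold else 0)
        mask.append(out)
    return mask


# Hypothetical model and a two-pixel image: one skin-like pixel, one not.
model = ((0.05, 0.02), (0.5, 0.1), (0.7, 0.15))
pixels = [[(0.05, 0.5, 0.7), (0.6, 0.9, 0.2)]]
mask = skin_mask(pixels, model, threshold=1.0)
```

Because the model is fixed in advance, a sketch of this kind depends on how well the model matches the lighting and skin tone of the actual image.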
[0007] Conventionally, in the case where a rectangle including a face-skin
region in the image is determined, the rectangle is commonly determined
visually by an operator.
[0008] Moreover, the central axis of a person's face has been commonly
detected based on the visual determination of an operator.
[0009] Another method for detecting the central axis of the face is as
follows: a skin-color portion of the face is extracted as a region, and
the region is projected to obtain a histogram. Then, the right and left
ends of the face are determined from the histogram, whereby the line
passing through the center thereof is determined as the central axis of
the face (Japanese Laid-Open Publication No. 7-181012).
[0010] Furthermore, respective vertical positions of the nose, the eyes
and the mouth on the face have been commonly detected based on the visual
determination of an operator.
[0011] Another method is to match an image template of the nose with an
input image (R. Brunelli and T. Poggio, "Face Recognition: Features
versus Templates", IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 15, No. 10, pp. 1042-1052, 1993). In this article, a
method for detecting the vertical positions by projecting a gray-level
image or an edge image to obtain a histogram, and examining peaks and
valleys of the histogram, has also been proposed.
[0012] Moreover, the width of the face has been commonly detected based on
the visual determination of an operator.
[0013] Another method is as follows: a skin-color portion of the face is
extracted as a region, and the region is projected to obtain a histogram.
Then, the right and left ends of the face are determined from the
histogram, whereby the distance between the ends is obtained as the width
of the face (Japanese Laid-Open Publication No. 7-181012).
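The projection method described for both the central axis and the width can be sketched as below. The cited publication is not reproduced here, so the details are assumptions: the skin region is taken to be a binary mask, the ends are the outermost columns containing at least min_count skin pixels, and the width counts both end columns. The function names are hypothetical.

```python
def column_projection(mask):
    """Count skin pixels in each column of a binary mask (rows of 0/1)."""
    return [sum(col) for col in zip(*mask)]


def face_ends(projection, min_count=1):
    """Return the leftmost and rightmost columns with at least min_count skin pixels."""
    cols = [x for x, n in enumerate(projection) if n >= min_count]
    if not cols:
        return None
    return cols[0], cols[-1]


mask = [
    [0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
]
projection = column_projection(mask)
left, right = face_ends(projection)
axis_x = (left + right) / 2   # line through the center of the two ends
width = right - left + 1      # width in pixels, counting both end columns
```

Since the axis is simply the midpoint of the two ends, any skin-colored background or a turned face shifts both the axis and the width.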
[0014] As described above, in order to output a part of the original
image, which includes a person's face of interest, as an image having a
particular size, a portion having the face at a well-balanced position is
first cut out from the original image, and thereafter, is
magnified/reduced to a required size. In the case of a photograph, such
magnification/reduction is conducted by, for example, a copying machine.
In the case of an electronic image, magnifying/reducing the image to a
desired size can be carried out easily. However, cutting out a portion
having the object of interest at a well-balanced position must be
conducted before such magnification/reduction.
[0015] In the case of an electronic image, it is also possible for a user
to adjust, in advance, the size of the face of the original image to an
appropriate size, move a frame on the screen according to the visual
determination of the user so that the face is located in the center, and
output the image located within the frame. An apparatus achieving such an
operation has been proposed in Japanese Laid-Open Publication No.
64-82854.
[0016] In order to achieve improved visual recognition of a person's face
on a photograph or an image, the amount of exposure light for printing is
adjusted in the case of a photograph. For an electronic image, there is
software capable of conducting adjustment of contrast, tonality and
brightness, edge sharpening, blurring processing and the like (e.g.,
"PhotoShop" as mentioned above).
[0017] When an image including an object or a person's face of interest is
cut out from the original image, the image is cut with a desired size
using a pair of scissors, a cutter or the like, in the case of a
photograph. However, using a pair of scissors, a cutter or the like to
cut an image is actually time-consuming. Moreover, cutting a portion
including the object or the face of interest at a well-balanced position
requires much skill. When software for processing an electronic image
obtained by a CCD camera or converted by a scanner is utilized (e.g.,
"PhotoShop" as mentioned above), the positions of two points are usually
designated by a coordinate input device such as a mouse, and a rectangle
having a diagonal between the two points is designated as a region. In
this case as well, cutting out a portion including an object or a face of
interest at a well-balanced position requires much skill. Furthermore, in
the case where an object or a face of interest is originally located at
the edge of the screen, and a portion including the object or the face at
a well-balanced position in the center is to be cut out from the image,
it is necessary to first cut out the portion from the original image, and
thereafter, move the position of the object or the face to the center of
the resultant image.
[0018] As described above, in order to output a part of the original
image, which includes an object of interest, as an image having a
particular size, a portion having the object of interest at a
well-balanced position is first cut out from the original image, and
thereafter, is magnified/reduced to a required size. In the case of a
photograph, such magnification/reduction is conducted by, for example, a
copying machine. However, the image is not always cut to the same size.
Therefore, in order to obtain an image with a desired size, a troublesome
operation of calculating the magnification/reduction ratio is required.
In the case of an electronic image, magnifying/reducing the image to a
desired size is easy. However, cutting out a portion having the object of
interest at a well-balanced position must be conducted before such
magnification/reduction. In short, at least two operations are required
to output an image having a particular size.
[0019] Furthermore, the above-mentioned method of painting out a visually
determined face-skin region is troublesome regardless of whether an image
to be processed is a photograph or an electronic image. Moreover,
painting a portion at the boundary between the face skin region and the
other regions must be conducted extremely carefully. In the case of an
electronic image, the above-mentioned method of combining those pixels
having similar color to that of the designated pixel to extract them as
one region (e.g. "PhotoShop" as mentioned above) has been used. In this
method, however, since the colors of the skin, the lip and the eyes are
different, it is necessary to combine the results of several operations
in order to extract the whole face-skin. Moreover, the skin color may be
significantly uneven even in the same person due to, for example,
different skin shades or shadows. In this case as well, the results of
several operations must be combined. Also described above is the method
of designating those pixels having a probability equal to or higher than
a prescribed value as a face-skin region (the above-cited reference by R.
Funayama, N. Yokoya, H. Iwasa and H. Takemura). According to this method,
however, a face-skin region might not be successfully extracted in the
case where the image's brightness is extremely uneven due to the
photographing conditions or the conditions at the time of obtaining the
image, or in the case where the color of the skin is different due to a
racial difference.
[0020] As described above, when a rectangle including a face-skin region
is to be obtained, the rectangle has been commonly determined visually by
an operator. However, such a method is troublesome regardless of whether
an image to be processed is a photograph or an electronic image.
[0021] Moreover, in the above-mentioned method of detecting the central
axis of a person's face from a histogram (Japanese Laid-Open Publication
No. 7-181012), the correct central axis can only be detected in the case
where the face is completely directed to the front, while the correct
central axis cannot be obtained in the case where the face is turned
even slightly to either side.
[0022] Furthermore, according to the above-mentioned method of matching an
image template of the nose with an input image (the above-cited reference
by R. Brunelli and T. Poggio), it is desirable that the size of the nose
to be extracted is known. In the case where the size of the nose is not
known, templates of various sizes must be matched with the input image,
requiring substantial time for calculation. Moreover, according to the
above-mentioned method of detecting the vertical positions by examining
peaks and valleys of the histogram (the above-cited reference by R.
Brunelli and T. Poggio), the vertical positions might not be correctly
extracted, for example, in the case where the face skin region or the
background is not known. In short, erroneous extraction can occur unless
such preconditions are satisfied.
[0023] Moreover, according to the above-mentioned method of detecting the
width of the face (Japanese Laid-Open Publication No. 7-181012), the face
skin region must be correctly extracted based on the color information.
However, in the case where a background region includes a color similar
to that of the face skin, a region other than the face skin region might
be determined as a face skin, or a shaded portion in the face skin region
might not be determined as face skin. The detected width of the face
might be different depending upon whether or not the ears can be seen on
the image. Moreover, the detected width could be larger than the correct
width in the case where the face is turned toward either side.
[0024] As described above, in order to output a part of the original
image, which includes an object of interest, as an image having a
particular size, a portion having the object of interest at a
well-balanced position is first cut out from the image, and thereafter,
is magnified/reduced to a required size. In the case of a photograph,
such magnification/reduction is conducted by, for example, a copying
machine. However, the image is not always cut to the same size.
Therefore, in order to obtain an image with a desired size, a troublesome
operation of calculating the magnification/reduction ratio is required.
In the case of an electronic image, magnifying/reducing the image to a
desired size can be easily carried out. However, cutting out a portion
having the object of interest at a well-balanced position must be
conducted before such magnification/reduction. In short, at least two
operations are required to output an image having a particular size.
According to a somewhat automated method as described in Japanese
Laid-Open Publication No. 64-82854, the user adjusts, in advance, the
size of the face of the original image to an appropriate size, moves a
frame on the screen according to the visual determination of the user so
that the face is located in the center, and outputs the image located
within the frame. Alternatively, the user adjusts, in advance, the size
of the face of the original image to an appropriate size, moves a
T-shaped indicator on the screen according to the visual determination of
the user so that the ends of the horizontal line of the T-shaped
indicator overlap the eyes, respectively, and then, outputs an image
within a rectangle defined with an appropriate margin from the T-shaped
indicator.
[0025] The above-described operation of adjusting the amount of exposure
light for printing in order to achieve improved visual recognition of a
person's face on a photograph or an image requires much skill. For an
electronic image, there is software capable of conducting adjustment of
contrast, tonality and brightness, edge sharpening, blurring processing
and the like (e.g., "PhotoShop" as mentioned above), as described above.
In this case as well, using such software requires much skill, and
usually, various operations must be tried until a desired image is
obtained.
SUMMARY OF THE INVENTION
[0026] According to one aspect of the present invention, an image
processing apparatus includes a designating section for designating an
arbitrary region or an arbitrary position of an image; a specifying
section for specifying an object region which is present in the
designated region or position, and which can additionally be in a
vicinity of the designated region or position, from pixel information in
the designated region or position; a determining section for determining
an image region to be cut out from the image, based on the specified
object region; and a cutting section for cutting out the determined image
region from the image.
[0027] In one example, the determining section includes a section for
adjusting a size of the image region to a prescribed size.
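Adjusting a cut-out region to a prescribed size amounts to resampling it. The interpolation method is not specified at this point, so the sketch below assumes nearest-neighbour sampling for brevity; resize_nearest is a hypothetical name.

```python
def resize_nearest(image, out_w, out_h):
    """Resample a list-of-rows image to out_w x out_h by nearest-neighbour lookup."""
    in_h, in_w = len(image), len(image[0])
    return [
        # Map each output pixel back to the source pixel it falls on.
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


region = [[1, 2],
          [3, 4]]
magnified = resize_nearest(region, 4, 4)
reduced = resize_nearest(magnified, 2, 2)
```

Because the output size is given directly, no separate magnification/reduction ratio needs to be computed by the user.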
[0028] In one example, the determining section includes a correcting
section for entirely correcting the designated image region or correcting
only a part of the designated image region.
[0029] According to another aspect of the present invention, an image
processing apparatus includes a designating section for designating an
arbitrary region or an arbitrary position of an image; an analyzing
section for analyzing a color distribution in the designated region or
position and in a vicinity of the designated region or position; an
adjusting section for adjusting a condition for specifying a face image
which is present in the image, according to a result of the analysis; a
specifying section for specifying a face image region which is present in
the designated region or position, and which can additionally be in the
vicinity of the designated region or position, based on the adjusted
condition; a determining section for determining an image region to be
cut out from the image, based on the specified face image region; and a
cutting section for cutting out the determined image region from the
image.
[0030] In one example, the determining section includes a section for
adjusting a size of the image region, using the region or the position
designated by the designating section as a reference.
[0031] In one example, the specifying section includes a section for
applying noise elimination or labelling to the specified face image
region to produce a face mask; a section for vertically scanning the
produced face mask to obtain a sum of vertical differential luminance
values of pixels in the image corresponding to the face mask to produce a
histogram; and a section for detecting a central axis of a face from a
profile of the produced histogram.
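The histogram used for the central axis can be sketched as follows. The exact differential operator and the rule for reading the axis off the profile are not stated in this passage, so the sketch assumes an absolute difference between vertically adjacent pixels that both lie inside the face mask, and it stops at producing the histogram. The function name is hypothetical.

```python
def column_vertical_difference_sum(luma, mask):
    """For each column, sum |L(y+1, x) - L(y, x)| over pixel pairs inside the mask."""
    height, width = len(luma), len(luma[0])
    hist = [0] * width
    for x in range(width):
        for y in range(height - 1):
            # Only count differences where both pixels belong to the face mask.
            if mask[y][x] and mask[y + 1][x]:
                hist[x] += abs(luma[y + 1][x] - luma[y][x])
    return hist


# Toy luminance image with one dark pixel in the middle column.
luma = [
    [100, 100, 100],
    [100, 40, 100],
    [100, 100, 100],
]
mask = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
hist = column_vertical_difference_sum(luma, mask)
```

Columns crossing features with strong vertical luminance changes accumulate large sums, which is what gives the profile a shape from which an axis can be detected.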
[0032] In one example, the specifying section includes a section for
applying noise elimination or labelling to the specified face image
region to produce a face mask; a section for vertically scanning the
produced face mask to obtain a mean luminance value of pixels in the
image corresponding to the face mask to produce a histogram; and a
section for detecting a vertical nose position from a profile of the
produced histogram.
[0033] In one example, the specifying section includes a section for
applying noise elimination or labelling to the specified face image
region to produce a face mask; a section for horizontally scanning the
produced face mask to obtain a mean luminance value of pixels in the
image corresponding to the face mask to produce a histogram; and a
section for detecting a vertical eye position from a profile of the
produced histogram.
[0034] In one example, the specifying section includes a section for
applying noise elimination or labelling to the specified face image
region to produce a face mask; a section for horizontally scanning the
produced face mask to obtain a mean luminance value of pixels in the
image corresponding to the face mask to produce a histogram; and a
section for detecting a vertical mouth position from a profile of the
produced histogram.
[0035] In one example, the specifying section further includes a section
for detecting a vertical eye position from the profile of the produced
histogram; and a section for obtaining a middle position of a region
between the detected vertical eye position and the detected vertical
mouth position to detect a width of the face mask at the middle position.
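The horizontal scan and the width measurement can be sketched together. How the eye and mouth rows are picked from the profile is not described in this passage, so the sketch takes them as inputs; the function names and the toy data are assumptions made for illustration.

```python
def row_mean_luminance(luma, mask):
    """Mean luminance of each row, counting only pixels inside the face mask."""
    profile = []
    for lrow, mrow in zip(luma, mask):
        inside = [v for v, m in zip(lrow, mrow) if m]
        profile.append(sum(inside) / len(inside) if inside else None)
    return profile


def mask_width_at_middle(mask, eye_y, mouth_y):
    """Width of the face mask on the row midway between the eye and mouth rows."""
    mid_y = (eye_y + mouth_y) // 2
    cols = [x for x, m in enumerate(mask[mid_y]) if m]
    width = cols[-1] - cols[0] + 1 if cols else 0
    return mid_y, width


mask = [
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
luma = [
    [0, 200, 200, 200, 0],
    [90, 60, 180, 60, 90],       # darker row, e.g. at the eyes
    [180, 180, 180, 180, 180],
    [0, 150, 90, 150, 0],        # darker row, e.g. at the mouth
    [0, 0, 170, 0, 0],
]
profile = row_mean_luminance(luma, mask)
# Assumed for the example: eyes on row 1, mouth on row 3.
mid_y, width = mask_width_at_middle(mask, eye_y=1, mouth_y=3)
```

In the toy data the eye and mouth rows show up as dips in the profile, and the width is read from the mask on the row between them.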
[0036] In one example, the determining section includes a section for
adjusting a position of the image region, based on the face image region,
a central axis of a face in the face image, a vertical nose position of
the face in the face image, a vertical eye position of the face in the
face image, a vertical mouth position of the face in the face image, and
a width of a face mask of the face image.
[0037] In one example, the determining section includes a section for
adjusting a size of the image region, based on the face image region, a
central axis of a face in the face image, a vertical nose position of the
face in the face image, a vertical eye position of the face in the face
image, a vertical mouth position of the face in the face image, and a
width of a face mask of the face image.
[0038] In one example, the determining section includes a correcting
section for entirely correcting the designated image region or correcting
only a part of the designated image region.
[0039] Thus, the invention described herein makes possible the advantage
of providing an image processing apparatus capable of photoelectrically
converting a printed image on a piece of paper or the like with
coordinates being designated in a different type of ink so as to input
the image and the coordinates, wherein the image processing apparatus
is capable of cutting out an object image with an arbitrary size at an
arbitrary position from the original image.
[0040] This and other advantages of the present invention will become
apparent to those skilled in the art upon reading and understanding the
following detailed description with reference to the accompanying
figures.
BRIEF DESCRIPTION OF THE DRAWINGS
[0041] FIG. 1 is a block diagram showing an image processing apparatus
according to one example of the present invention;
[0042] FIG. 2 is a block diagram showing an image/coordinate input
apparatus in the image processing apparatus shown in FIG. 1;
[0043] FIG. 3 is a block diagram showing another image/coordinate input
apparatus in the image processing apparatus shown in FIG. 1;
[0044] FIG. 4 shows examples of a region of an object or a face in the
image designated by the user;
[0045] FIG. 5 shows examples of a position of the object or a face in the
image designated by the user;
[0046] FIGS. 6A through 6D show images illustrating the steps from the
user's designation to the extraction of an image;
[0047] FIG. 7 is a flow chart illustrating Image processing procedure 1
conducted by the image processing apparatus of the example shown in FIG.
1;
[0048] FIG. 8 is a diagram showing the pixels of an object region;
[0049] FIG. 9 is a diagram illustrating how a part of an image is attached
to a document;
[0050] FIG. 10 is a diagram illustrating an example of extracting only a
face-skin portion from the image including a person's face;
[0051] FIGS. 11A, 11B and 11C show the frequency histograms plotted with
respect to the hue, color saturation and brightness, respectively;
[0052] FIG. 12 is a flow chart illustrating Image processing procedure 3
for producing an image representing a face skin region;
[0053] FIGS. 13A and 13B show input patterns designated by the user;
[0054] FIG. 14A shows an example of the image;
[0055] FIG. 14B is a graph showing the relationship between brightness and
frequency of the image of FIG. 14A;
[0056] FIG. 15 is a diagram illustrating an example of extracting only a
face skin portion from the image including a person's face;
[0057] FIG. 16 is a diagram illustrating how the size of a window region
is gradually increased;
[0058] FIG. 17 shows an input pattern designated by the user;
[0059] FIG. 18 is a flow chart illustrating Image processing procedure 5
for producing a face mask by the image processing apparatus of the
example shown in FIG. 1;
[0060] FIG. 19 illustrates how the face mask is produced;
[0061] FIG. 20 is a flow chart illustrating the process for detecting the
central axis of the face;
[0062] FIG. 21 is a diagram illustrating the process for detecting the
central axis of the face;
[0063] FIG. 22 is a flow chart illustrating Image processing procedure 6
for detecting a vertical position of the nose by the image processing
apparatus of the example shown in FIG. 1;
[0064] FIG. 23 is a diagram illustrating the process for detecting the
vertical position of the nose;
[0065] FIG. 24 is a flow chart illustrating Image processing procedure 7
for detecting a vertical position of the eyes by the image processing
apparatus of the example shown in FIG. 1;
[0066] FIG. 25 is a diagram illustrating the process for detecting the
vertical position of the eyes;
[0067] FIG. 26 is a flow chart illustrating Image processing procedure 8
for detecting a vertical position of the mouth by the image processing
apparatus of the example shown in FIG. 1;
[0068] FIG. 27 is a diagram illustrating the process for detecting the
vertical position of the mouth;
[0069] FIG. 28 is a flow chart illustrating Image processing procedure 9
for detecting a width of a face mask by the image processing apparatus of
the example shown in FIG. 1;
[0070] FIG. 29 is a diagram illustrating the process for detecting the
width of the face mask;
[0071] FIG. 30 is a flow chart illustrating Image processing procedure 10
for cutting out a rectangular image from the original image by the image
processing apparatus of the example shown in FIG. 1;
[0072] FIG. 31 shows a sheet of an address book with a face image being
attached thereto; and
[0073] FIG. 32 is a diagram illustrating the process for correcting an
image.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0074] Hereinafter, the present invention will be described by way of
illustrative examples with reference to the accompanying drawings. The
same reference numerals designate the same components throughout.
[0075] FIG. 1 is a block diagram showing an image processing apparatus
according to one example of the present invention. An image to be
processed and coordinates required for the processing are input by an
image/coordinate input apparatus 1-1. In the case where the image is in a
digital form, the image is directly stored in an input image storing
section 1-2-1 of a storage apparatus 1-2. In the case where the input
image is in an analog form, the image is converted into a digital form, and
the resultant image is stored in the input image storing section 1-2-1.
The input coordinates are stored in an input coordinate storing section
1-2-2. An image processing section 1-3 uses the stored image and
coordinates as input information to conduct an appropriate image
processing in an operation region of a memory within the image processing
section 1-3. Thereafter, the image processing section 1-3 stores the
resultant image and coordinates in an output image storing section 1-2-3
and an output coordinate storing section 1-2-4 of the storage apparatus
1-2, respectively. After undergoing processing, the resultant image can
be sent to an image output apparatus 1-4, whereby a copy of the resultant
image can be made.
[0076] FIGS. 2 and 3 are diagrams illustrating in detail the
image/coordinate input apparatus 1-1 shown in FIG. 1.
[0077] The image/coordinate input apparatus 1-1 in FIG. 1 separately
includes an image input apparatus 2-1 and a coordinate input apparatus
2-2, as shown in FIG. 2. The input image from the image input apparatus
2-1 is stored in the input image storing section 1-2-1 of the storage
apparatus 1-2, whereas the input coordinates from the coordinate input
apparatus 2-2 are stored in the input coordinate storing section 1-2-2 of
the storage apparatus 1-2. For example, the image input apparatus 2-1 may
be a camera capable of directly inputting a digitized image by a
solid-state image sensing device (CCD; charge coupled device); an
apparatus capable of digitizing a photograph or printed matter, such as a
scanner; or an apparatus holding a digitized image, such as equipment
connected to a network like the Internet, or a magnetic storage
apparatus. As the coordinate input apparatus 2-2, a mouse capable of
inputting coordinates with a pointer displayed on a display, a track
ball, a pen-type apparatus, a pen-type coordinate input apparatus using a
tablet, a coordinate input apparatus using a finger, or the like may be
used.
[0078] Alternatively, the image/coordinate input apparatus 1-1 in FIG. 1
may include an image reading apparatus 2-3 and an image/coordinate
separation apparatus 2-4, as shown in FIG. 3. This type of image/coordinate input
apparatus 1-1 is used in the case where both an image including an object
to be processed and input coordinates are present on a single image. For
example, in the case where a line or a point representing the coordinates
is drawn in a particular color on a photograph, only a component of that
color is extracted to obtain a separate image. Thereafter, the position
of the point or the line is analyzed from the separate image, whereby the
coordinates are extracted.
[0079] FIG. 4 shows examples of a region of the object in the image
designated by the user. First, an image and a pattern indicated by a
solid line or points, as shown in FIG. 4, are input to the
image/coordinate input apparatus 1-1 (FIG. 1). In the case of a
rectangular pattern 4-1, the coordinates of two points, that is, the
coordinates of the upper left point and the lower right point of the
pattern are used as the input coordinates. In the case of a pattern 4-4,
4-10, 4-11, 4-12, 4-13 or 4-14, the coordinates of the upper left point
and the lower right point of a rectangle circumscribing the input pattern
(i.e., such a rectangle as shown by a dotted line on each image) are used
as the input coordinates. In the case of the other patterns, two
coordinates defining a rectangle circumscribing the input pattern can be
used as the input coordinates. However, in the case of a line or dot
pattern, that is, in the case of a pattern 4-2, 4-3, 4-5 or 4-6, either no
rectangle circumscribing the pattern can be obtained, or the rectangle
obtained has an extremely large aspect ratio. In
such a case, an appropriate rectangle will be set according to a mean
aspect ratio of the object (this rectangle will be set as a square when
the object is not known). In the case of a pattern 4-2, for example, the
object is a person's face and a rectangle circumscribing the input
pattern is extremely long in the longitudinal direction (or the input
pattern is a vertical straight line and no rectangle circumscribing the
input pattern can be obtained). In such a case, a rectangle as shown by a
dotted line is set. In other words, a rectangle horizontally
magnified/reduced from the rectangle circumscribing the input pattern is
obtained by multiplying the length of the rectangle circumscribing the
input pattern by a prescribed ratio. Furthermore, the coordinates of the
upper left point and the lower right point are used as the input
coordinates. In the case of a pattern 4-7, 4-8 or 4-9, a rectangle
longitudinally and laterally magnified from the rectangle circumscribing
the input pattern by respective prescribed ratios is set, and the
coordinates of two points of the rectangle are used as the input
coordinates.
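The handling of line-like patterns described above can be sketched as follows. This is only an illustrative reading of the paragraph, not the apparatus itself: the function name `input_rectangle`, the parameter `max_aspect` (the aspect ratio beyond which a circumscribing rectangle is treated as degenerate) and the width/height convention for `mean_aspect` are assumptions introduced here. A single dot carries no size information and is left as a zero-size rectangle.

```python
def input_rectangle(points, mean_aspect=1.0, max_aspect=4.0):
    """Return (left, top, right, bottom) for a user-drawn pattern.

    points      : list of (x, y) samples of the drawn pattern
    mean_aspect : assumed width/height of the object (1.0 = square, used
                  when the object is not known)
    max_aspect  : hypothetical limit beyond which the circumscribing
                  rectangle is considered too elongated to use directly
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, right, top, bottom = min(xs), max(xs), min(ys), max(ys)
    width, height = right - left, bottom - top
    cx, cy = (left + right) / 2.0, (top + bottom) / 2.0

    if height > 0 and width < height / max_aspect:
        # Vertical line or very tall rectangle: derive the width from the length.
        width = height * mean_aspect
    elif width > 0 and height < width / max_aspect:
        # Horizontal line or very wide rectangle: derive the height instead.
        height = width / mean_aspect

    return (cx - width / 2.0, cy - height / 2.0,
            cx + width / 2.0, cy + height / 2.0)
```

For a vertical stroke from (50, 0) to (50, 100) with an unknown object, the sketch returns the square (0, 0, 100, 100).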
[0080] FIG. 5 shows examples of a position of the object designated by the
user. In the case where the user designates a point such as a pattern
5-1, the coordinates of that point can be used as the input coordinates.
In the case where the user designates a pattern other than the point such
as a pattern 5-2, the center of a circumscribed rectangle can be used as
the input coordinates.
[0081] Image Processing Procedure 1
[0082] Image processing procedure 1 conducted by the image processing
apparatus of the present example will now be described with reference to
the flow chart of FIG. 7.
[0083] First, using the image/coordinate input apparatus 1-1 (FIG. 1), the
user roughly designates a region of the object in the image stored in the
input image storing section 1-2-1, as shown in FIG. 4, or roughly
designates a position of the object, as shown in FIG. 5. FIGS. 6A through
6D show images illustrating the steps from the user's designation to the
extraction of an image. When a region 6-1-1 is designated by the user
(Step S1-1), as shown in FIG. 6A, the image processing section 1-3
obtains a rectangular region 6-1-2 reduced from the rectangle
circumscribing the input pattern by an appropriate ratio, and stores the
region 6-1-2 as a set region in the input coordinate storing section
1-2-2 (Step S1-7). As shown in FIG. 6B, when a position 6-2-1 is
designated by the user (Step S1-2), the image processing section 1-3
obtains an appropriate rectangular region 6-2-2 centered around the
designated position 6-2-1 (Step S1-3), and stores the region 6-2-2 in the
input coordinate storing section 1-2-2 (Step S1-7).
[0084] The image processing section 1-3 (FIG. 1) utilizes the operation
region of the memory within the image processing section 1-3 to store the
color information of the pixels included in the rectangular region 6-1-2
or 6-2-2 (Step S1-4), and sets the rectangular region 6-1-2 or 6-2-2 as
an initial value of the object region (Step S1-5).
[0085] FIG. 8 shows the pixels in the object region. The image processing
section 1-3 finds a pixel 8-2 adjacent to the object region 8-1. When the
pixel 8-2 satisfies at least one of the following two conditions (Step
S1-6), the pixel 8-2 is added to the object region (Step S1-9):
[0086] 1. the color difference between the pixel of interest and an
adjacent pixel in the object region is within a prescribed range; and/or
[0087] 2. the color difference between the pixel of interest and a pixel
stored in Step S1-4 is within a prescribed range.
[0088] The image processing section 1-3 examines all of the pixels
adjacent to the object region in terms of the above two conditions. This
operation is repeated until no pixel can be added to the object region.
Then, as shown in FIG. 6C, the image processing section 1-3 obtains a
final object region 6-3-1 (Step S1-8). It should be noted that, although
various indices of color difference have been proposed, Godlove's
color-difference formula as shown in "Improved Color-Difference Formula
with Applications to the Perceptibility and Acceptability of Fadings", I.
H. Godlove, J. Opt. Soc. Am., 41, 11, pp. 760-772, 1951 may be used.
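The region-growing loop of Steps S1-4 through S1-9 can be sketched as follows. This is a minimal illustration under stated assumptions: a plain Euclidean RGB distance stands in for Godlove's color-difference formula, pixels are 4-connected, and the names `color_diff`, `grow_object_region` and the tolerance `tol` are hypothetical.

```python
from collections import deque


def color_diff(c1, c2):
    """Euclidean RGB distance (a stand-in for Godlove's formula)."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5


def grow_object_region(image, rect, tol=30.0):
    """Grow the object region outward from the set rectangle.

    image : list of rows, each a list of (r, g, b) tuples
    rect  : (left, top, right, bottom), inclusive pixel coordinates
    tol   : assumed "prescribed range" for the color difference
    """
    h, w = len(image), len(image[0])
    left, top, right, bottom = rect
    # Step S1-5: the rectangle is the initial object region.
    region = {(x, y) for y in range(top, bottom + 1)
              for x in range(left, right + 1)}
    # Step S1-4: remember the colors found inside the rectangle.
    seed_colors = {image[y][x] for (x, y) in region}

    frontier = deque(region)
    while frontier:  # repeat until no pixel can be added
        x, y = frontier.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nx < w and 0 <= ny < h) or (nx, ny) in region:
                continue
            c = image[ny][nx]
            # Condition 1: close to the adjacent pixel already in the region.
            near_neighbour = color_diff(c, image[y][x]) <= tol
            # Condition 2: close to some color stored in Step S1-4.
            near_seed = any(color_diff(c, s) <= tol for s in seed_colors)
            if near_neighbour or near_seed:
                region.add((nx, ny))  # Step S1-9
                frontier.append((nx, ny))
    return region
```

Each region pixel is visited once, so every adjacent candidate is eventually compared against each of its neighbours in the region before the loop ends.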
[0089] The image processing section 1-3 expresses the area of the object
region 6-3-1 as the number of pixels included in the object region 6-3-1.
Then, as shown in FIG. 6D, the image processing section 1-3 obtains a
rectangular region 6-3-3 centered around the center of gravity of the
object region 6-3-1 and sized such that the area of the object region
6-3-1 corresponds to a prescribed percentage (e.g., 30%) of the total
area of the rectangular region 6-3-3.
Thereafter, the image processing section 1-3 cuts out the rectangular
region 6-3-3 from the original image. The shape of the rectangular region
6-3-3 may be square. Alternatively, the shape of the rectangular region
6-3-3 may be set as appropriate depending upon applications. For example,
the rectangular region 6-3-3 may be set to have a ratio of 4:3 according
to the aspect ratio of a television screen, or may be set to have a ratio
of 16:9 according to the aspect ratio of a high-definition television
screen. It should be noted that, although the rectangular region is
centered around the center of gravity of the object region in the above
description, the position of the center of gravity in the rectangular
region may be shifted longitudinally and laterally depending upon the
application.
[0090] A method for obtaining the center of gravity is described in, for
example, "Robot Vision" by M. Yachida, Shohkohdo, ISBN4-7856-3074-4
C3355, 1990. A part of the image can be cut out from the original image,
based on the coordinates of the rectangular region.
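The calculation of the rectangle to be cut out can be sketched as follows, reading the prescribed percentage as the share of the rectangle's area that the object region should occupy. The function name `crop_rectangle` and the parameters `fraction` and `aspect` (width divided by height) are assumptions made for illustration.

```python
def crop_rectangle(region, fraction=0.30, aspect=1.0):
    """Rectangle centered on the region's center of gravity.

    region   : set of (x, y) pixels belonging to the object region
    fraction : share of the rectangle's area occupied by the object
    aspect   : width / height of the rectangle (1.0 = square,
               4/3 or 16/9 for television formats)
    Returns (left, top, right, bottom) as floats.
    """
    area = len(region)                      # area = number of pixels
    cx = sum(x for x, _ in region) / area   # center of gravity
    cy = sum(y for _, y in region) / area
    rect_area = area / fraction
    width = (rect_area * aspect) ** 0.5
    height = rect_area / width
    return (cx - width / 2.0, cy - height / 2.0,
            cx + width / 2.0, cy + height / 2.0)
```

The returned coordinates may extend past the image border, so a caller would clip them or shift the rectangle before cutting.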
[0091] Image Processing Procedure 2
[0092] The image processing section 1-3 magnifies or reduces the image
which has been cut out according to Image processing procedure 1, to an
appropriate size, and stores the resultant image in the output image
storing section 1-2-3 of the storage apparatus 1-2. The image processing
section 1-3 may utilize the stored image for any appropriate
applications. For example, an image 9-1 including an automobile and
obtained by a digital camera, as shown in FIG. 9, is stored in the input
image storing section 1-2-1. Then, a part of the image including only the
automobile is cut out from the input image. Thereafter, this part of the
image is attached to a report 9-2 having a prescribed format and a frame
for a prescribed image size. The resultant report 9-2 is stored in the
output image storing section 1-2-3.
[0093] Image Processing Procedure 3
[0094] Before Image processing procedure 3, the color distribution of a
person's face skin is analyzed in advance according to the following
procedures:
[0095] 1. the face-skin portion is manually extracted from a face image
10-1 to produce a face-skin image 10-2 (FIG. 10);
[0096] 2. a face-skin image is similarly produced for a plurality of
different persons;
[0097] 3. frequency histograms are plotted with respect to the hue (FIG.
11A, 11-1-1), color saturation (FIG. 11B, 11-2-1) and brightness (FIG.
11C, 11-3-1) of the pixels of each of the face-skin images to obtain the
color distribution; and
[0098] 4. for each histogram, the mean and variance of the distribution
are obtained, and such a normal probability density function (11-1-2,
11-2-2, 11-3-2) that best fits the distribution is obtained.
[0099] Thus, the color distribution of the face skin can be expressed by
the normal probability density functions (P.sub.hue(hue), P.sub.sat(sat)
and P.sub.val(val)) of the hue, color saturation and brightness, each
function having two arguments: the mean and variance (.mu..sub.hue,
.sigma..sup.2.sub.hue; .mu..sub.sat, .sigma..sup.2.sub.sat; and
.mu..sub.val, .sigma..sup.2.sub.val, respectively). In this
specification, each of the normal probability density functions is
referred to as a skin-region probability density function. Each
skin-region probability density function is expressed by the following
expressions:
P.sub.hue(hue).about.N(.mu..sub.hue, .sigma..sup.2.sub.hue) (1)
P.sub.sat(sat).about.N(.mu..sub.sat, .sigma..sup.2.sub.sat) (2)
P.sub.val(val).about.N(.mu..sub.val, .sigma..sup.2.sub.val) (3)
[0100] When the calculated mean and variance are applied to the normal
distribution, those values which are significantly different from a mean
value, if any, would result in a greater estimation of the variance than
the actual variance. Even a few values would cause such an estimation.
For example, in the case of the hue distribution histogram as shown in
FIG. 11A, most of the pixels are distributed within about .+-.30 of about
20. In this histogram, values such as 100 and -150 would result in a
greater estimation of the variance. Therefore, in order to obtain a normal
distribution curve (a probability density function) which can be applied
to a more accurate distribution, it would be better to first remove those
pixels having such values, and thereafter, calculate the mean and
variance.
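The effect of removing outlying values before fitting can be sketched as follows. The text does not say how the outlying pixels are identified; this sketch assumes, purely for illustration, that values farther than a fixed `window` from the median are discarded. The function name `robust_normal_fit` is hypothetical.

```python
from statistics import mean, median, pvariance


def robust_normal_fit(values, window=60.0):
    """Fit a normal distribution after discarding far-off values.

    values : hue (or saturation, or brightness) samples
    window : assumed cut-off; samples farther than this from the
             median are treated as outliers and removed
    Returns (mean, variance) of the remaining samples.
    """
    centre = median(values)
    kept = [v for v in values if abs(v - centre) <= window]
    return mean(kept), pvariance(kept)


# Hue samples clustered around 20, plus two stray values as in FIG. 11A.
hues = [10] * 10 + [20] * 10 + [30] * 10 + [100, -150]
mu, var = robust_normal_fit(hues)
naive_var = pvariance(hues)  # much larger, inflated by the two outliers
```

With the two stray values removed, the variance reflects only the main cluster, so the fitted curve follows the histogram more closely.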
[0101] The image processing section 1-3 stores each of the normal
probability density functions in advance, and processes the image stored
in the input image storing section 1-2-1 according to the flow chart of
FIG. 12. In Step S1-0, the image processing section 1-3 sets an original
processing region, based on the user input. In the case where a pattern
(region) 9-1 as shown in FIG. 13A is input from the image/coordinate
input apparatus 1-1 to the input coordinate storing section 1-2-2, the
image processing section 1-3 sets a processing region 9-2 of the image
stored in the input image storing section 1-2-1 in such a way as
described above. In the case where a pattern (position) 9-4 as shown in
FIG. 13B is input, the image processing section 1-3 sets a processing
region 9-5 (Step S1-0). The image processing section 1-3 substitutes a
hue value, a color-saturation value and a brightness value of each pixel
in the respective normal probability density functions obtained as
described above, so as to obtain the respective probabilities. Such a
pixel that has a value equal to or higher than a prescribed probability
with respect to each of the hue, color saturation and brightness is
determined as an original probable face-skin pixel (Step S2-1). At this
time, the prescribed probability should be set to a small value such as
5% so that as many pixels as possible may be selected as a probable
face-skin pixel. Thus, any pixels which possibly correspond to the
face-skin portion are determined as original probable face-skin pixels.
Thereafter, the image processing section 1-3 calculates the mean and
variance of each of the hue, color saturation and brightness (Step S2-2).
In the foregoing description, an original probable face-skin pixel is
selected based on the probabilities of the hue, color saturation and
brightness. However, it may also be effective to adjust each threshold to
a value close to the pixel value of the above-mentioned prescribed
probability, depending upon the characteristics of an imaging system.
[0102] Provided that the mean and variance of the hue, color saturation
and brightness thus calculated are .mu.'.sub.hue, .sigma..sup.2'.sub.hue;
.mu.'.sub.sat, .sigma..sup.2'.sub.sat; and .mu.'.sub.val,
.sigma..sup.2'.sub.val, respectively, corresponding probability density
functions P'.sub.hue(hue), P'.sub.sat(sat) and P'.sub.val(val) having
these arguments can be expressed by the following expressions:
P'.sub.hue(hue).about.N(.mu.'.sub.hue, .sigma..sup.2'.sub.hue) (4)
P'.sub.sat(sat).about.N(.mu.'.sub.sat, .sigma..sup.2'.sub.sat) (5)
P'.sub.val(val).about.N(.mu.'.sub.val, .sigma..sup.2'.sub.val) (6)
[0105] Using these probability density functions, the image processing
section 1-3 selects face-skin pixels according to the following
procedures:
[0106] 1. first, all of the pixels in the image are set as initial values,
and any pixels for which the probability P'.sub.hue(hue), calculated from
the hue value as an argument, is equal to or lower than a prescribed
probability are removed (Step S2-3);
[0107] 2. next, any pixels for which the probability P'.sub.sat(sat),
calculated from the color-saturation value as an argument, is equal to or
lower than a prescribed probability are removed (Step S2-4); and
[0108] 3. finally, any pixels for which the probability P'.sub.val(val),
calculated from the brightness value as an argument, is equal to or lower
than a prescribed probability are removed (Step S2-5).
[0109] As a result, a face-skin region is specified (Step S2-6).
[0110] The lower limit of each probability is set higher than it was
set when the original probable face-skin pixels were obtained. For
example, provided that the previous threshold of the probability is 5% as
described above, the threshold may be set to 30%. As a result, more
accurate extraction can be carried out. More specifically, any pixels
that have been wrongly extracted as not being noise based on the 5%
threshold, would be removed based on the 30% threshold.
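The two-pass selection of Steps S2-1 through S2-6 can be sketched as follows. The text does not define how a probability is read from a normal density; this sketch assumes the density is expressed relative to its peak value, so that it ranges from 0 to 1. The names `relative_likelihood` and `select_skin_pixels` are hypothetical.

```python
import math
from statistics import mean, pvariance


def relative_likelihood(x, mu, var):
    """Normal density at x divided by its peak value (assumed reading)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var))


def select_skin_pixels(pixels, prior, loose=0.05, strict=0.30):
    """Return the indices of pixels kept as face skin.

    pixels : list of (hue, saturation, brightness) tuples
    prior  : three (mean, variance) pairs analyzed in advance
    loose  : low threshold for the original probable pixels (e.g., 5%)
    strict : higher threshold applied with the updated functions (e.g., 30%)
    """
    # Step S2-1: keep every pixel that could plausibly be skin.
    candidates = [p for p in pixels
                  if all(relative_likelihood(p[c], *prior[c]) >= loose
                         for c in range(3))]
    if not candidates:
        return []
    # Step S2-2: recompute mean and variance from this image's candidates.
    adapted = []
    for c in range(3):
        channel = [p[c] for p in candidates]
        adapted.append((mean(channel), max(pvariance(channel), 1e-9)))
    # Steps S2-3 to S2-5: remove pixels at or below the stricter threshold.
    return [i for i, p in enumerate(pixels)
            if all(relative_likelihood(p[c], *adapted[c]) > strict
                   for c in range(3))]
```

Because the second pass uses statistics gathered from the image itself, a pixel that passed the loose test only marginally can still be rejected by the stricter one.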
[0111] In the foregoing description, selection of the pixels corresponding
to the face-skin portion is conducted based on the probabilities.
However, it may also be effective to adjust each threshold to a value
close to the pixel value of the above-mentioned prescribed probability,
depending upon the characteristics of an imaging system. For example, as
can be seen from FIG. 14A, the face skin and the hair of an image 14-1
have different brightnesses. FIG. 14B is a histogram showing the
brightness versus frequency of the image of FIG. 14A. As shown in FIG.
14B, a peak 14-2 representing the hair appears at a lower value of the
brightness, whereas a peak 14-3 representing the face-skin region appears
at a relatively higher value of the brightness. Provided that a peak
value is simply selected as a threshold of the brightness probability of
the image 14-1, the peak value 14-2 might be set as a threshold, whereby
those pixels corresponding to a part of the hair might be selected as the
pixels corresponding to the face skin. In such a case, such an algorithm
as an Ohtsu's discriminant analysis method (which is described in the
above-cited reference: "Robot Vision" by M. Yachida) may be applied to a
value equal to or lower than an appropriate brightness value to set a
more appropriate value 14-5 as the brightness threshold.
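A discriminant-analysis threshold of the kind referred to above can be sketched as follows. This is the commonly described form of the method, which picks the threshold that maximizes the between-class variance of a histogram; it is not taken from the cited reference, and the function name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(hist):
    """Threshold t that best separates a histogram into two classes.

    hist[i] is the number of pixels with brightness i. The two classes
    are brightness values 0..t and t+1..end; the t with the largest
    between-class variance is returned.
    """
    total = sum(hist)
    total_sum = sum(i * n for i, n in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0, sum0 = 0, 0
    for t, n in enumerate(hist):
        w0 += n            # pixels in the darker class
        sum0 += t * n
        w1 = total - w0    # pixels in the brighter class
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (total_sum - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

To restrict the analysis to values at or below a chosen brightness, as described above, a caller could pass only the corresponding lower part of the histogram, for example `otsu_threshold(hist[:upper + 1])` with `upper` chosen by the caller.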
[0112] By updating the skin region probability density functions as
appropriate in such a manner as described above, an image 12-3
representing a face-skin region can be obtained from an image 12-1, as
shown in FIG. 15 (Step S2-6). The image 12-3 thus obtained has a smaller
amount of noise, as compared to an image 12-2 conventionally extracted
using a fixed function.
[0113] Image Processing Procedure 4
[0114] Image processing procedure 4 is conducted after the image
representing the face-skin region is obtained according to Image
processing procedure 3. Referring to an image 16-1 in FIG. 16, in the
case where only a position 16-1-0 is designated by the user, the image
processing section 1-3 sets the smallest rectangle 16-1-1 centered around
the designated point, and sets a region 16-1-3 located between the
rectangle 16-1-1 and a slightly larger rectangle 16-1-2 as an initial
window region. The image processing section 1-3 gradually magnifies the
window region 16-1-3 as shown in images 16-2 and 16-3, until one of the
four sides of the outer rectangle 16-1-2 of the window region 16-1-3
reaches the edge of the input image. Thereafter, the image processing
section 1-3 calculates the dispersion of the pixels of the window region
16-1-3 in the image representing the face-skin region. The largest
dispersion will be calculated when both the face skin and the contour of
a part other than the face skin appear in the window region as shown in
an image 16-4. Accordingly, during the operation of gradually magnifying
the window region 16-1-3, the image processing section 1-3 determines the
outer rectangle 16-1-2 corresponding to the largest dispersion, as a
rectangle including the face skin region.
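The search over a growing window region can be sketched as follows. For simplicity, this sketch assumes square rectangles centered on the designated point, a binary face-skin image, and the variance of the pixel values as the dispersion. The names `ring_variance` and `find_face_rectangle` and the ring thickness `ring` are hypothetical.

```python
def ring_variance(mask, cx, cy, inner, outer):
    """Variance of mask values between two concentric squares.

    inner and outer are half-sizes; pixels lying outside the inner
    square and inside (or on) the outer square form the window region.
    """
    vals = [mask[y][x]
            for y in range(cy - outer, cy + outer + 1)
            for x in range(cx - outer, cx + outer + 1)
            if max(abs(x - cx), abs(y - cy)) > inner]
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)


def find_face_rectangle(mask, cx, cy, ring=2):
    """Outer rectangle of the window with the largest dispersion.

    mask : rows of 0/1 values (1 = face-skin pixel)
    Returns (left, top, right, bottom), or None if no window fits.
    """
    h, w = len(mask), len(mask[0])
    best, best_outer = -1.0, None
    inner = 0
    while True:
        outer = inner + ring
        # Stop once the outer rectangle would pass the edge of the image.
        if cx - outer < 0 or cy - outer < 0 or cx + outer >= w or cy + outer >= h:
            break
        var = ring_variance(mask, cx, cy, inner, outer)
        if var > best:
            best, best_outer = var, outer
        inner += 1  # gradually magnify the window region
    if best_outer is None:
        return None
    return (cx - best_outer, cy - best_outer, cx + best_outer, cy + best_outer)
```

The variance of a binary window is largest when skin and non-skin pixels are mixed, which is the situation shown in the image 16-4.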
[0115] As shown in FIG. 17, in the case where a region 15-1, not a
position, is designated by the user, the image processing section 1-3
magnifies or reduces an outer rectangle defining a window region by an
appropriate ratio to a size smaller than that of a rectangle 15-2
obtained from the designated region 15-1. Thus, the smallest rectangle
15-3 is obtained, whereby an initial window region is set such that the
outer rectangle defining the window region corresponds to the rectangle
15-3. Thereafter, the image processing section 1-3 gradually magnifies
the window region, until an inner rectangle of the window region becomes
larger than a rectangle 15-4 magnified by an appropriate ratio from the
rectangle 15-2. The image processing section 1-3 then calculates the
dispersion of the pixels within the window region in a similar manner,
and determines the outer rectangle corresponding to the largest
dispersion, as a rectangle including the face-skin region. It should be
noted that, provided that the region designated by the user is only
slightly shifted from the face region, the rectangle magnified by an
appropriate ratio from the rectangle obtained from the designated region
may be determined as the rectangle including the face-skin region.
[0116] Image Processing Procedure 5
[0117] FIG. 18 is a flow chart showing Image processing procedure 5
conducted by the image processing section 1-3. The image processing
section 1-3 processes an input color image 17-1 shown in FIG. 19
according to Image processing procedure 4 to obtain a rectangle including
a face-skin region 17-2. The image processing section 1-3 processes that
rectangle according to Image processing procedure 3 to obtain an image
17-3 representing a face skin region as shown in FIG. 19. The image
processing section 1-3 combines the pixels connected to each other in the
face-skin region image 17-3 to produce a label image. The image
processing section 1-3 then extracts only a label region having the
largest area from the produced label image, and forms a binary image 17-4
from the label region (Step S3-1). Regarding the image 17-4, the image
processing section 1-3 replaces black pixels (holes) surrounded by white
pixels with white pixels to fill the holes. As a result, an image 17-5 is
formed (Step S3-2). The image processing section 1-3 first reduces the
size of the image 17-5 once (Step S3-3), and again produces a label
image. The image processing section 1-3 extracts only a label region
having the largest area from the label image (Step S3-4). After
magnifying the resultant image n times (Step S3-5), the image processing
section 1-3 reduces the size of the image n times (Step S3-6), and
extracts only a label region having the largest area from the resultant
label image (Step S3-7). Thus, a face mask 17-6 is obtained.
[0118] In the above steps, n should be set to, for example, 3 or 4
depending upon the size, characteristics or the like of the image. The
magnifying and reducing processing as described above is described in the
above-cited reference: "Robot Vision" by M. Yachida.
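Steps S3-1 through S3-7 can be sketched as follows. This sketch assumes that "reducing" and "magnifying" the image refer to binary contraction (erosion) and expansion (dilation) with a 4-connected neighbourhood, and that pixels outside the image count as black. All function names are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def components(mask, target):
    """4-connected components of pixels equal to target, as coordinate lists."""
    h, w = len(mask), len(mask[0])
    seen, comps = set(), []
    for y in range(h):
        for x in range(w):
            if mask[y][x] != target or (x, y) in seen:
                continue
            comp, queue = [], deque([(x, y)])
            seen.add((x, y))
            while queue:
                px, py = queue.popleft()
                comp.append((px, py))
                for dx, dy in NEIGHBOURS:
                    nx, ny = px + dx, py + dy
                    if (0 <= nx < w and 0 <= ny < h and (nx, ny) not in seen
                            and mask[ny][nx] == target):
                        seen.add((nx, ny))
                        queue.append((nx, ny))
            comps.append(comp)
    return comps


def largest_component(mask):
    """Keep only the white label region with the largest area."""
    out = [[0] * len(mask[0]) for _ in mask]
    comps = components(mask, 1)
    if comps:
        for x, y in max(comps, key=len):
            out[y][x] = 1
    return out


def fill_holes(mask):
    """Turn black regions that do not touch the image border white."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for comp in components(mask, 0):
        if not any(x in (0, w - 1) or y in (0, h - 1) for x, y in comp):
            for x, y in comp:
                out[y][x] = 1
    return out


def _neighbour_test(mask, x, y, combine):
    h, w = len(mask), len(mask[0])
    return combine(0 <= x + dx < w and 0 <= y + dy < h and mask[y + dy][x + dx]
                   for dx, dy in NEIGHBOURS)


def erode(mask):
    """A pixel stays white only if all four neighbours are white."""
    return [[1 if mask[y][x] and _neighbour_test(mask, x, y, all) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]


def dilate(mask):
    """A pixel becomes white if it or any of its four neighbours is white."""
    return [[1 if mask[y][x] or _neighbour_test(mask, x, y, any) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]


def face_mask(skin, n=3):
    mask = fill_holes(largest_component(skin))   # Steps S3-1, S3-2
    mask = largest_component(erode(mask))        # Steps S3-3, S3-4
    for _ in range(n):                           # Step S3-5
        mask = dilate(mask)
    for _ in range(n):                           # Step S3-6
        mask = erode(mask)
    return largest_component(mask)               # Step S3-7
```

In this reading, the first contraction removes thin bridges and isolated specks, and the expansion followed by contraction closes small gaps along the contour.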
[0119] The face mask 17-6 thus obtained is used to define the range to be
subjected to the processing according to the flow chart shown in FIG. 20.
The image processing section 1-3 extracts only luminance components from
the input color image 17-1 to obtain a gray-level image 17-2 (Step S4-1).
At the same time, the image processing section 1-3 produces the face mask
17-6 according to the flow chart in FIG. 18 (Step S3-0). The image
processing section 1-3 differentiates the gray-level image 17-2 in a
vertical direction with respect to the white pixels in the face mask 17-6
to obtain a differentiated image 17-7 (Step S4-2). In the image 17-7,
those pixels corresponding to the black pixels in the face mask 17-6 are
set to zero. Such a differentiated image is commonly obtained by using,
for example, a Prewitt's operator (the above-cited reference: "Robot
Vision" by M. Yachida).
[0120] The image processing section 1-3 projects the image 17-7 in a
vertical direction to obtain a histogram 17-8 (Step S4-3). A vertical
axis of the histogram 17-8 shows the sum of the pixel values of the image
17-7 at a corresponding horizontal position. Referring to FIG. 21, the
image processing section 1-3 sets such a vertical axis 21-1a that
horizontally divides the histogram 21-1 into two regions: right and left
regions. The image processing section 1-3 obtains such an axis 21-2 that
has the smallest value of SSDS given by the following expression:
$$\mathrm{SSDS} = \sum_{\substack{i \ge 1 \\ (a-i) > a_{\min},\ (a+i) < a_{\max}}} \bigl( f(a-i) - f(a+i) \bigr)^{2}$$
[0121] where a indicates a position of the axis 21-1a, a.sub.min indicates
a left end of the histogram, a.sub.max indicates a right end of the
histogram, and f(s) indicates a height of the histogram (Step S4-4).
Then, the image processing section 1-3 sets the position 21-2 as a
central axis 21-3 of the face.
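The search for the axis with the smallest SSDS (Step S4-4) can be sketched as follows. Because an axis close to either end of the histogram has very few mirrored pairs and therefore a trivially small sum, this sketch restricts the candidate axes to a central band; that restriction, the parameter `margin` and the function name `central_axis` are assumptions not stated in the text.

```python
def central_axis(hist, margin=None):
    """Position a with the smallest sum of squared mirrored differences.

    hist   : heights f(s) of the projection histogram
    margin : assumed number of positions excluded at each end
             (defaults to a quarter of the histogram length)
    """
    a_min, a_max = 0, len(hist) - 1
    if margin is None:
        margin = len(hist) // 4
    best_a, best_ssds = None, float("inf")
    for a in range(a_min + margin, a_max - margin + 1):
        ssds, i = 0.0, 1
        # Sum over every offset i with (a - i) > a_min and (a + i) < a_max.
        while a - i > a_min and a + i < a_max:
            ssds += (hist[a - i] - hist[a + i]) ** 2
            i += 1
        if ssds < best_ssds:
            best_a, best_ssds = a, ssds
    return best_a
```

For a histogram that is mirror-symmetric about one position, every term of the sum is zero at that position, so it is returned as the central axis.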
[0122] Image Processing Procedure 6
[0123] FIG. 22 is a flow chart illustrating Image processing procedure 6
performed by the image processing section 1-3. The image processing
section 1-3 produces the gray-level image 17-2 and the face mask 17-6
based on the image 17-1 as shown in FIG. 23 (Steps S4-1 and S3-0). The
image processing section 1-3 horizontally scans only the gray-level image
within the face mask 17-6 to produce a histogram 18-1 projecting a mean
luminance value (Step S5-1). The image processing section 1-3 then
produces a histogram 18-2 having a reduced resolution from the histogram
18-1 (Step S5-2), and searches for a peak position 18-2-1 approximately
in the middle of the lower-resolution histogram 18-2 (Step S5-3). In the
case where no peak is found (Step S5-6, No), the image processing section
1-3 sets the position in the middle of the histogram as a vertical nose
position (Step S5-5). In the case where any peak is found (Step S5-6,
Yes), the image processing section 1-3 scans a region around the position
of the histogram 18-1 corresponding to the detected peak of the
lower-resolution histogram 18-2, in order to search for a peak position
18-3-1 (Step S5-4). The image processing section 1-3 sets this peak
position 18-3-1 as the vertical nose position (Step S5-0).
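The coarse-to-fine peak search of Steps S5-1 through S5-5 can be sketched as follows. The reduction factor, the choice of the peak closest to the middle, and the size of the refinement window are assumptions; the names `row_profile`, `downsample` and `nose_row` are hypothetical. Row indices are assumed to increase downward.

```python
def row_profile(gray, mask):
    """Mean luminance of the masked pixels in each row (Step S5-1)."""
    profile = []
    for g_row, m_row in zip(gray, mask):
        vals = [g for g, m in zip(g_row, m_row) if m]
        profile.append(sum(vals) / len(vals) if vals else 0.0)
    return profile


def downsample(profile, factor):
    """Average consecutive groups of rows to reduce the resolution (Step S5-2)."""
    return [sum(profile[i:i + factor]) / len(profile[i:i + factor])
            for i in range(0, len(profile), factor)]


def nose_row(profile, factor=4):
    """Vertical nose position estimated from the row profile."""
    coarse = downsample(profile, factor)
    mid = (len(coarse) - 1) / 2.0
    peaks = [i for i in range(1, len(coarse) - 1)
             if coarse[i] > coarse[i - 1] and coarse[i] >= coarse[i + 1]]
    if not peaks:
        return len(profile) // 2           # Step S5-5: fall back to the middle
    peak = min(peaks, key=lambda i: abs(i - mid))   # Step S5-3
    # Step S5-4: refine in the full-resolution profile around that peak.
    lo = max(0, (peak - 1) * factor)
    hi = min(len(profile), (peak + 2) * factor)
    return max(range(lo, hi), key=lambda r: profile[r])
```

Searching the low-resolution histogram first suppresses small local fluctuations, and the full-resolution histogram is consulted only to pin down the exact row.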
[0124] Image Processing Procedure 7
[0125] FIG. 24 is a flow chart illustrating Image processing procedure 7
conducted by the image processing section 1-3. The image processing
section 1-3 produces a horizontal histogram 25-5 as shown in FIG. 25
according to Image processing procedure 6 (Step S5-10). Using this
histogram 25-5, the image processing section 1-3 scans a region 25-1
above a vertical nose position 25-6 detected in Image processing
procedure 6 to detect the deepest two valleys 25-2 and 25-3 (Step S6-1).
In the case where the two valleys are both detected (Step S6-3), the
image processing section 1-3 sets the lower one of the valleys, that is,
the valley 25-3 as a vertical position 25-7 of the eyes (Step S6-2). In
the case where only one valley is detected (Step S6-4), the image
processing section 1-3 sets the detected valley as the vertical eye
position (Step S6-5). In the case where no valley is detected, the image
processing section 1-3 sets the position in the middle of the region
between the vertical nose position and the upper end of the histogram
25-5 as the vertical eye position (Step S6-6).
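The valley-based decision of Steps S6-1 through S6-6 can be sketched as follows. The sketch assumes that the histogram is indexed by row with index 0 at the top, so that "above the nose" means smaller indices and the "lower" valley has the larger index. The names `local_valleys` and `eye_row` are hypothetical.

```python
def local_valleys(profile, start, stop):
    """Indices in [start, stop) that are local minima of the profile."""
    return [i for i in range(max(start, 1), min(stop, len(profile) - 1))
            if profile[i] < profile[i - 1] and profile[i] <= profile[i + 1]]


def eye_row(profile, nose):
    """Vertical eye position from the row profile and the nose row."""
    valleys = local_valleys(profile, 0, nose)
    deepest = sorted(valleys, key=lambda i: profile[i])[:2]  # Step S6-1
    if len(deepest) == 2:
        return max(deepest)   # Step S6-2: the lower of the two valleys
    if len(deepest) == 1:
        return deepest[0]     # Step S6-5
    return nose // 2          # Step S6-6: midway between top and nose
```

When two valleys are found, the upper one is discarded and the lower one is taken as the eye position.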
[0126] Image Processing Procedure 8
[0127] FIG. 26 is a flow chart illustrating Image processing procedure 8
conducted by the image processing section 1-3. The image processing
section 1-3 produces a horizontal histogram 26-1 as shown in FIG. 27
according to Image processing procedure 6 (Step S5-10). Using the
histogram 26-1, the image processing section 1-3 scans a region 26-3
below the vertical nose position 26-2 detected in Image processing
procedure 6 to detect the deepest three valleys 26-4, 26-5 and 26-6 (Step
S7-1). In the case where the three valleys are detected (Step S7-2), the
image processing section 1-3 sets the middle one of the valleys, that is,
the valley 26-5 as a vertical position 26-7 of the mouth (Step S7-5), as
shown in an image 26-8.
[0128] In the case where only two valleys are detected (Step S7-3), the
image processing section 1-3 first detects the widths of a face mask
26-11 at the two valleys. Then, the image processing section 1-3
calculates the ratio of the width 26-10 of the face mask 26-11 at the
lower valley to the width 26-9 at the upper valley. In the case where the
calculated ratio is higher than a prescribed value (e.g., 0.7) (Step
S7-6), the image processing section 1-3 sets the position of the upper
valley as a vertical mouth position (Step S7-9). Otherwise, the image
processing section 1-3 sets the position of the lower valley as the
vertical mouth position (Step S7-10).
[0129] In the case where only one valley is detected (Step S7-4), the
image processing section 1-3 sets the position of the detected valley as
the vertical mouth position (Step S7-7).
[0130] In the case where no valley is detected, the image processing
section 1-3 sets the position in the middle of the region between the
vertical nose position and the lower end of the histogram 26-1 as the
vertical mouth position (Step S7-8).
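The four cases of Image processing procedure 8 can be sketched as follows. Row indices are assumed to increase downward, and the names `mask_width` and `mouth_row` are hypothetical; the default threshold of 0.7 is the example value given above.

```python
def mask_width(mask, row):
    """Distance between the leftmost and rightmost white pixels of a row."""
    cols = [x for x, m in enumerate(mask[row]) if m]
    return (cols[-1] - cols[0] + 1) if cols else 0


def mouth_row(profile, mask, nose, ratio_threshold=0.7):
    """Vertical mouth position from the row profile below the nose."""
    valleys = [i for i in range(nose + 1, len(profile) - 1)
               if profile[i] < profile[i - 1] and profile[i] <= profile[i + 1]]
    # Step S7-1: keep the three deepest valleys, ordered from top to bottom.
    deepest = sorted(sorted(valleys, key=lambda i: profile[i])[:3])
    if len(deepest) == 3:
        return deepest[1]                        # Step S7-5: middle valley
    if len(deepest) == 2:
        upper, lower = deepest
        w_upper = mask_width(mask, upper)
        ratio = mask_width(mask, lower) / w_upper if w_upper else 0.0
        # Steps S7-6, S7-9, S7-10: compare the face-mask widths.
        return upper if ratio > ratio_threshold else lower
    if len(deepest) == 1:
        return deepest[0]                        # Step S7-7
    return (nose + len(profile) - 1) // 2        # Step S7-8
```

In the two-valley case, the width comparison decides between the valleys: the upper valley is chosen when the face mask is still relatively wide at the lower valley, and the lower valley is chosen otherwise.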
[0131] Image Processing Procedure 9
[0132] FIG. 28 is a flow chart illustrating Image processing procedure 9
conducted by the image processing section 1-3. As shown in FIG. 29, a
face mask 28-1, a vertical eye position 28-2 and a vertical mouth
position 28-3 are obtained according to Image processing procedures 7 and
8 (Steps S3-0, S6-0 and S7-0). The image processing section 1-3
horizontally scans the pixels from the vertical eye position 28-2 to the
vertical mouth position 28-3 in order to obtain a width of the face mask
28-1. The image processing section 1-3 obtains a width in the middle of
the region between the vertical positions 28-2 and 28-3 as a width 28-4
of the face (Step S29-1).
[0133] Image Processing Procedure 10
[0134] FIG. 30 is a flow chart illustrating Image processing procedure 10
conducted by the image processing section 1-3. The face mask, the central
axis of the face, the vertical eye position, the vertical mouth position,
and the width of the face are detected according to Image processing
procedures 5, 6, 7, 8 and 9. The distance between the eyes and the mouth
can be obtained from the vertical eye position and the vertical mouth
position. Using such information, the image processing section 1-3 cuts
out an image which includes a face having an appropriate size and located
at a well-balanced position in the horizontal and vertical directions,
from the original image.
[0135] First, the image processing section 1-3 determines whether or not
the detected width of the face is reliable. The width of the face is
detected according to Image processing procedure 9, and the central axis
of the face is detected according to Image processing procedure 5.
Accordingly, the width of the face is divided into two widths by the
central axis. A width on the left side of the central axis is herein
referred to as a left-face width, whereas a width on the right side of
the central axis is herein referred to as a right-face width. The image
processing section 1-3 verifies that the left-face width and the
right-face width are not zero (Step S10-1). Then, the image processing
section 1-3 calculates the ratio of the left-face width to the right-face
width to determine whether or not the calculated ratio is within a
prescribed threshold-range (Step S10-2). In the case where the ratio is
not within the threshold-range (Step S10-2, Yes), the image processing
section 1-3 determines that the detected width of the face is not
reliable, and determines a rectangle to be cut out from the detected
eye-mouth distance (Step S10-6). More specifically, the image processing
section 1-3 sets the intersection of the central axis of the face and the
vertical nose position as a reference point. Then, the image processing
section 1-3 calculates a rectangle centered around the reference point
and having a width and length each calculated as a product of the
eye-mouth distance and a respective prescribed ratio (Step S10-6). Thus,
the rectangle to be cut out is obtained.
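The reliability check of Steps S10-1 and S10-2 and the fallback of Step S10-6 can be sketched as below. This is a minimal sketch under stated assumptions: the threshold range (low, high) and the two ratios are placeholder values, since the specification says only that they are prescribed; a zero width is treated as unreliable, which the specification does not state explicitly; and rectangles are represented as (left, top, width, height). All function and parameter names are hypothetical.

```python
def width_is_reliable(left_width, right_width, low=0.5, high=2.0):
    """Steps S10-1 and S10-2: check the left-face to right-face width ratio.

    low and high are placeholder thresholds, not values from the source.
    Treating a zero width as unreliable is an assumption.
    """
    if left_width <= 0 or right_width <= 0:
        return False
    return low <= left_width / right_width <= high


def rect_from_eye_mouth(axis_x, nose_y, eye_mouth_dist,
                        width_ratio=3.0, height_ratio=4.0):
    """Step S10-6: rectangle centered on the axis/nose intersection.

    Width and length are each the eye-mouth distance times a ratio.
    Returns (left, top, width, height); the ratios are placeholders.
    """
    width = eye_mouth_dist * width_ratio
    height = eye_mouth_dist * height_ratio
    return (axis_x - width / 2, nose_y - height / 2, width, height)


if not width_is_reliable(left_width=10, right_width=50):
    cutout = rect_from_eye_mouth(axis_x=100, nose_y=120, eye_mouth_dist=20)
```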
[0136] In the case where the width of the face is reliable (Step S10-2,
No), the image processing section 1-3 determines whether or not the
detected eye-mouth distance is reliable (Step S10-3). The image
processing section 1-3 calculates the ratio of the detected eye-mouth
distance to the length of the detected rectangle circumscribing a pattern
designated by the user, and determines whether or not the calculated
ratio is within a prescribed threshold-range. Note that
in the case where a position, not a region, is designated by the user,
the image processing section 1-3 calculates the ratio of the detected
eye-mouth distance to a rectangle reduced by a prescribed ratio from the
face-skin region obtained according to Image processing procedure 4. In
the case where the ratio is not within the threshold-range (Step S10-3, No), the image
processing section 1-3 determines that the detected vertical eye position
and the detected vertical mouth position (and the detected eye-mouth
distance) are not reliable, and determines a rectangle to be cut out from
the detected width of the face. More specifically, the image processing
section 1-3 sets as a reference point the intersection of the detected
central axis of the face and the vertical center line of the rectangle
circumscribing the pattern designated by the user. Then, the image
processing section 1-3 calculates a rectangle centered around the
reference point and having a width and length each calculated as a
product of the width of the face and a respective prescribed ratio. Thus,
the rectangle to be cut out is obtained (Step S10-5).
[0137] In the case where both the width of the face and the eye-mouth
distance are reliable (Step S10-3, Yes), the image processing section 1-3
determines a rectangle to be cut out from these two values. More
specifically, the image processing section 1-3 sets the intersection of
the detected central axis of the face and the vertical nose position as a
reference point, and calculates weighted arithmetic mean values by
respectively multiplying the width of the face and the eye-mouth distance
by a prescribed ratio. Then, the image processing section 1-3 calculates
a rectangle centered around the reference point and having a width and
length each calculated as a product of the respective calculated
arithmetic mean value and a respective prescribed ratio (Step S10-4).
Thus, a rectangle to be cut out is obtained.
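One possible reading of Step S10-4 is sketched below. The specification does not give the weights or the ratios, so the values here are placeholders, and the use of a single weighted mean for both dimensions is an assumption made for brevity. The names weighted_mean and rect_from_both are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean of values with the given weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def rect_from_both(axis_x, nose_y, face_width, eye_mouth_dist,
                   weights=(0.5, 0.5), width_ratio=2.0, height_ratio=2.5):
    """Step S10-4: combine face width and eye-mouth distance.

    The reference point is the intersection of the central axis and the
    vertical nose position. Weights and ratios are placeholder values.
    Returns (left, top, width, height).
    """
    size = weighted_mean((face_width, eye_mouth_dist), weights)
    width = size * width_ratio
    height = size * height_ratio
    return (axis_x - width / 2, nose_y - height / 2, width, height)


cutout = rect_from_both(axis_x=100, nose_y=120,
                        face_width=60, eye_mouth_dist=20)
```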
[0138] Finally, the image processing section 1-3 calculates the ratio of
the size of the rectangle thus obtained to the size of the rectangle
circumscribing the pattern designated by the user, and determines whether
or not the calculated ratio is within a prescribed threshold-range (Step
S10-7). In the case where the ratio is not within the threshold-range,
the image processing section 1-3 determines that the obtained rectangle
is not appropriate, and determines a rectangle from the pattern
designated by the user. More specifically, in the case where a region is
designated by the user, the image processing section 1-3 sets the center
of a rectangle circumscribing the region as a reference point. Then, the
image processing section 1-3 calculates a rectangle centered around the
reference point and having a width and length each calculated as a
product of the length of the circumscribing rectangle and a respective
prescribed ratio (Step S10-8). Thus, the rectangle to be cut out is obtained. In the
case where a position is designated by the user, the center of the
rectangle including a face skin region obtained according to Image
processing procedure 4 is used as a reference point, and similar
processing is carried out to obtain a rectangle to be cut out.
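The final check of Step S10-7 and the fallback of Step S10-8 can be sketched as follows for the case where a region is designated by the user. The sketch assumes that the "size" of a rectangle means its area, that rectangles are (left, top, width, height), and that the thresholds and ratios are placeholder values. The function names are hypothetical.

```python
def area(rect):
    """Area of a (left, top, width, height) rectangle."""
    return rect[2] * rect[3]


def choose_cutout(candidate, user_rect, low=0.5, high=4.0,
                  width_ratio=1.5, height_ratio=1.5):
    """Steps S10-7 and S10-8: accept the candidate or fall back.

    candidate is the rectangle from the earlier steps; user_rect is the
    rectangle circumscribing the user-designated region. Interpreting
    "size" as area, and all default values, are assumptions.
    """
    ratio = area(candidate) / area(user_rect)
    if low <= ratio <= high:
        return candidate
    # Fallback: center on the circumscribing rectangle and derive both
    # dimensions from its length (height), as the text describes.
    left, top, user_w, user_h = user_rect
    center_x = left + user_w / 2
    center_y = top + user_h / 2
    width = user_h * width_ratio
    height = user_h * height_ratio
    return (center_x - width / 2, center_y - height / 2, width, height)


user_rect = (50, 50, 100, 100)
accepted = choose_cutout((40, 40, 120, 150), user_rect)
fallback = choose_cutout((0, 0, 10, 10), user_rect)
```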
[0139] Image Processing Procedure 11
[0140] The image processing section 1-3 magnifies or reduces the face
image which is cut out according to Image processing procedure 10 to an
appropriate size, and stores the resultant image in the output image
storing section 1-2-3 of the storage apparatus 1-2. The image processing
section 1-3 can utilize the stored face image for appropriate
applications such as an address book in a portable information tool. For
example, the image processing section 1-3 stores an image of a person
obtained by a digital camera, such as an image 30-1 as shown in FIG. 31,
in the input image storing section 1-2-1, and roughly designates a
portion in and around the face using the image coordinate input apparatus
1-1. Then, the image processing section 1-3 cuts out an image including
the face at a well-balanced position from the original image according to
Image processing procedure 10, and magnifies or reduces the resultant
image to fit a prescribed frame. Thus, the resultant image is attached to
a document, as shown by an image 30-2 of FIG. 31. The image 30-2 is a
sheet of the address book with the face image being attached thereto.
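The magnification or reduction to fit a prescribed frame can be illustrated with the sketch below. The specification does not state whether the aspect ratio is preserved or which interpolation is used; preserving the aspect ratio and using nearest-neighbor sampling are assumptions chosen for brevity. The image is assumed to be a list of rows of pixel values, and the function names are hypothetical.

```python
def fit_scale(src_w, src_h, frame_w, frame_h):
    """Largest scale at which the source still fits inside the frame."""
    return min(frame_w / src_w, frame_h / src_h)


def resize_nearest(image, new_w, new_h):
    """Magnify or reduce an image by nearest-neighbor sampling."""
    src_h = len(image)
    src_w = len(image[0])
    return [
        [image[y * src_h // new_h][x * src_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


face = [[1, 2],
        [3, 4]]
scale = fit_scale(2, 2, frame_w=4, frame_h=6)   # limited by the frame width
resized = resize_nearest(face, int(2 * scale), int(2 * scale))
```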
[0141] Image Processing Procedure 12
[0142] A face mask is obtained according to Image processing procedure 12.
In order to improve the visual recognition of the face in the image, the
image processing section 1-3 of the present example appropriately
processes only a portion of the input image corresponding to a
white-pixel region of the face mask to make the image characteristics of
the face-skin region and the other regions different from each other.
Alternatively, in order to improve the visual recognition of the face in
the image, the image processing section 1-3 may appropriately process
only a portion of the input image corresponding to a black-pixel region
of the face mask to make the image characteristics of the face region and
the other regions different from each other.
[0143] For example, FIG. 32 is a diagram illustrating the image correction
processing. In the case where a face mask 31-2 is obtained from an input
image 31-1, the image processing section 1-3 reduces the sharpness of the
portion of the input image corresponding to the black-pixel region of the
face mask 31-2, using a Gaussian filter or an averaging filter. As a
result, an image 31-3 having reduced visual recognition of the background
other than the face and having improved visual recognition of the face is
obtained. In the case where the input image is not a sharp image, the
image processing section 1-3 improves the visual recognition of the face
by processing the portion of the input image corresponding to the
white-pixel region of the face mask by, for example, edge sharpening. As
a result, an image 31-4 is obtained. Similar effects may be obtained by
reducing the contrast, instead of the sharpness, of the regions other
than the face region. In the case where the input image
is a low-contrast image, similar effects may be obtained by increasing
the contrast of the face-skin region. Alternatively, the contrast of the
whole input image may be increased so that the portion of the input image
corresponding to the white-pixel region of the face mask has the highest
contrast.
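The background softening described for the image 31-3 can be sketched with an averaging filter applied only to the black-pixel region of the face mask. The sketch assumes a grayscale image stored as a list of rows of integers and a mask of the same size in which a nonzero value marks a white (face) pixel. The filter radius is a placeholder, and the function name blur_background is hypothetical.

```python
def blur_background(image, mask, radius=1):
    """Apply an averaging filter to pixels outside the face mask only.

    image: list of rows of grayscale integers.
    mask:  same shape; nonzero marks a face (white) pixel, left untouched.
    The window is clipped at the image border; radius is a placeholder.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                continue  # face pixel: keep its original sharpness
            total = 0
            count = 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            result[y][x] = total // count
    return result


image = [[0, 90, 0],
         [90, 90, 90],
         [0, 90, 0]]
mask = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
softened = blur_background(image, mask)
```

Reading from the unmodified input rather than from the partially filtered result keeps the averaging independent of scan order.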
[0144] According to the present invention, the user roughly designates a
position (or a region) of the object in the original image, whereby an
image which includes the object at a well-balanced position can be cut
out from the original image.
[0145] In one example, the user roughly designates a position (or a
region) of the object in the original image, whereby an image having a
prescribed size and including the object at a well-balanced position can
be output.
[0146] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby a region representing
a face skin can be extracted.
[0147] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby a rectangle including
a region representing the face skin can be obtained.
[0148] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby the central axis of
the face can be detected.
[0149] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby a vertical position of
the nose in the face can be detected.
[0150] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby a vertical position of
the eyes in the face can be detected.
[0151] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby a vertical position of
the mouth in the face can be detected.
[0152] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby a width of the face
can be detected.
[0153] In one example, the user roughly designates a position (or a
region) of the person's face in the original image, whereby an image
which includes the face at a well-balanced position can be cut out from
the original image.
[0154] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby an image having a
prescribed size and including the face at a well-balanced position can be
output.
[0155] In one example, the user roughly designates a position (or a
region) of the person's face in the image, whereby the image quality can
be adjusted so that the visual recognition of the face is improved.
[0156] Various other modifications will be apparent to and can be readily
made by those skilled in the art without departing from the scope and
spirit of this invention. Accordingly, it is not intended that the scope
of the claims appended hereto be limited to the description as set forth
herein, but rather that the claims be broadly construed.
* * * * *