WO2007044828A2

WO2007044828A2 - System and method for edge detection in image processing and recognition

Info

Publication number: WO2007044828A2
Application number: PCT/US2006/039812
Authority: WO
Inventors: Lauren Barghout
Original assignee: Paravue Corporation
Priority date: 2005-10-06
Filing date: 2006-10-06
Publication date: 2007-04-19
Also published as: WO2007044828A3

Abstract

A method and system for detecting edges contained in a digital image are disclosed. In one embodiment, a digital image (2600) is received as an input. The digital image is processed to detect one or more edges (2601) of the digital image as a function of different orientations relative to the frame of the digital image. An edge from the one or more detected edges is eliminated (2603) to create a set of edge element groupings with similar curvilinear orientations and close proximity.

Description

SYSTEM AND METHOD FOR EDGE DETECTION IN IMAGE PROCESSING AND RECOGNITION

FIELD OF THE INVENTION

The field of the invention relates generally to image processing and recognition systems and more particularly relates to a system and method for edge detection.

BACKGROUND OF THE INVENTION

Traditionally, edge detection used in digital image processing and recognition utilized either gradient based methods or second order methods. The gradient method detects the edges by looking for the maximum and minimum in the first derivative of the image. The second order method searches for zero crossings in the second derivative of the image to find edges. Many different edge detection methods are available, such as Roberts, Prewitt, Sobel, Laplacian, Laplacian of Gaussian and Canny. Each of these methods performs its edge detection in a slightly different manner. These techniques employ functions of the actual pixel values of the processed digital images, which correspond to an ambient light array, and are prone to developing non-continuous edges. Often, edge detection using these techniques results in image distortion and produces false edges that differ from human edge detection.

There is a need for an improved technique that employs perceptual concept values in edge detection to provide more definite and less noisy edges. There is also a need for edge detection utilizing various perceptual concept values in combination, such as color, texture, symmetry, closure, motion, spatial, and contour attributes.

SUMMARY

A method and system for detecting edges contained in a digital image are disclosed. In the preferred embodiment, a digital image is processed using a bank of Gabor filters at different orientations, phases, and spatial frequencies to detect one or more edge elements, characterized by orientation, spatial frequency, phase, and pixel location. While the luminance channel can be used to generate edge elements, the method is not limited to this choice. Any information channel such as redness or other cognitively relevant value such as figureness can be used to generate its corresponding edge element set. Spurious edge elements, which do not conform to criteria of proximity and similar orientation, are eliminated. The reduced edge element set, referred to herein as a contour map, contains those edge elements that are part of continuous contours. The logical intersection of this contour map with a cognitively relevant parsing from another system results in the final detected edge.

The above and other preferred features, including various novel details of implementation and combination of elements, will now be more particularly described with reference to the accompanying drawings and pointed out in the claims. It will be understood that the particular methods and systems described herein are shown by way of illustration only and not as limitations. As will be understood by those skilled in the art, the principles and features described herein may be employed in various and numerous embodiments without departing from the scope of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are included as part of the present specification, illustrate the presently preferred embodiment and together with the general description given above and the detailed description of the preferred embodiment given below serve to explain and teach the principles of the present invention.

Figure 1 illustrates a flow diagram of an exemplary edge detection process, according to one embodiment of the present invention.

Figure 2a is an example of a digital image, and Figure 2b is an exemplary intensity map of the digital image's luminance, which may be processed, according to one embodiment of the present invention.

Figure 3 illustrates examples of Gabor filters with different phases and orientations and an exemplary luminance edge element map, parameterized by the different phases and orientations, produced according to one embodiment of the present invention.

Figure 4 illustrates an example of the cognitively relevant value figureness derived from the image provided in Figure 2a, according to one embodiment of the invention.

Figure 5 illustrates an example of a compressed luminance edge element map, according to one embodiment of the invention.

Figure 6 illustrates an example of attribute correspondence between different attributes of an edge element map, according to one embodiment of the invention.

Figure 7 illustrates an example of how attributes of an edge element map are shifted to achieve correspondence, according to one embodiment of the invention.

Figure 8 illustrates an exemplary fuzzy correspondence with a Gaussian membership, according to one embodiment of the invention.

Figure 9a illustrates an exemplary luminance contour map, according to one embodiment of the invention. Figure 9b illustrates an exemplary edge map, according to one embodiment of the invention.

Figure 10 illustrates an exemplary computer architecture for use with the present system, according to one embodiment.

DETAILED DESCRIPTION

A method and system for detecting edges contained in a digital image are disclosed. In one embodiment, a digital image is received as an input. The digital image is processed by a bank of Gabor filters of different orientations and sizes to detect edges. Edges which do not obey rules of proximity and good curvilinear orientation are eliminated. The final edge is determined via a logical intersection of the remaining edge elements and the region for which the edge is determined.

In the following description, for purposes of explanation, specific nomenclature is set forth to provide a thorough understanding of the various inventive concepts disclosed herein. However, it will be apparent to one skilled in the art that these specific details are not required in order to practice the various inventive concepts disclosed herein.

Some portions of the detailed descriptions that follow are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.

The present invention also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories ("ROMs"), random access memories ("RAMs"), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.

The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.

According to one embodiment, the following terms may have the following meanings without regard to their upper or lower case usage. However, one of ordinary skill would understand that additional embodiments may contemplate additional terms and/or variations of these terms.

An edge element may be an output of a contrast-detecting filter, where contrast may be defined on a cognitively relevant variable such as luminance or figureness. A spurious edge element may be an element which does not belong to a contour. A contour may be a sequence of phenomenological edge elements of close proximity and curvilinear orientation such that they are perceptually grouped as a continuous entity. An edge may correspond to the perceived phenomenological boundary between physical, perceptual, conceptual, or categorical regions.

In other words, the edge element may be an output of a filter. Spurious edge elements may be elements not part of a contour. A contour may be a continuous set of grouped edge elements of locally similar (curvilinear) orientations. An edge may be a contour which corresponds to a boundary between physical, perceptual, conceptual, or categorical regions.

The "edge" definition provided is not intended to limit the commonly known definition, but rather to serve as an explanation for embodiments of the present invention, where an edge may be defined based on any cognitively relevant value. Cognitively relevant values may be characteristics of the visual stimulus that are pertinent to the needs of an observer. Finding cognitively relevant values may rely upon making certain assumptions as to the visual characteristics that are reliable indicators of those features. For example, a decision may be made based on the information available as to which region of the image constitutes the subject of interest, referred to as the "figure," and the region of the image which does not constitute the subject of interest, referred to as the "background" or "ground." For clarification, the terms "figure" and "ground" are used according to the definitions in psychology literature and defined as Gestalt rules of perception. The figure is cognitively relevant information because it is presumed to be the part of an image that a perceiver cares about. Concept modules of a system may each be tuned to extract a certain type of cognitively relevant information, referred to as cognitively relevant values. In some cases, cognitively relevant values are derived directly from the measured information of an image, such as the redness derived from the red-level value at a particular pixel location. In other cases, cognitively relevant values are derived from other cognitively relevant values, such as edges determined from patterns in luminosity contrast.

Figure 1 illustrates the flow of an exemplary edge detection process 26, according to one embodiment of the present invention. Data from digital image 2600 is input into a phenomenological edge detection system, according to one embodiment of the invention. This data can be any cognitively relevant value of the input image, such as luminance, color dominance, symmetry, or figure-ground designation. An example of digital image data 2600 is provided in Figure 2b, which is an exemplary intensity map of the luminance channel (e.g., normalized to contrast units) of the input image shown in Figure 2a. Digital image data 2600 is processed for edge elements in process 2601. An exemplary method for performing such a process is by convolving the image data with Gabor filters. Gabor filters are parameterized according to orientation, phase, and spatial frequency. Examples of different Gabor filters are shown in Figure 3a, which correspond to the input images shown in Figure 3b.

As shown in Figure 3a, the luminance channel of the input image is convolved with Gabor filters set at various orientations and phases. The orientation of the Gabor filters assists in locating edges at a particular orientation. The phase of the Gabor filters assists in detecting different types of edge gradients (e.g., light->dark, light->dark->light, dark->light, dark->light->dark). The images in the top row are created using a Gabor filter phase of zero, and the input channel is convolved with the Gabor filters oriented at different angles. The images in the bottom row are created using a Gabor filter orientation of zero, and the input channel is convolved with the Gabor filters at different phases. Image [1] is created using a Gabor filter at phase = 0 and orientation = 0. Image [2] is created using a Gabor filter at phase = 0 and orientation = 45. Image [3] is created using a Gabor filter at phase = 0 and orientation = 90. Image [4] is created using a Gabor filter at phase = 0 and orientation = 135. Image [5] is created using a Gabor filter at phase = 90 and orientation = 0. Image [6] is created using a Gabor filter at phase = 90 and orientation = 45. Image [7] is created using a Gabor filter at phase = 90 and orientation = 90. Image [8] is created using a Gabor filter at phase = 90 and orientation = 135. Image [9] is created using a Gabor filter at phase = 180 and orientation = 0. Image [10] is created using a Gabor filter at phase = 180 and orientation = 45. Image [11] is created using a Gabor filter at phase = 180 and orientation = 90. Image [12] is created using a Gabor filter at phase = 180 and orientation = 135. Image [13] is created using a Gabor filter at phase = 270 and orientation = 0. Image [14] is created using a Gabor filter at phase = 270 and orientation = 45. Image [15] is created using a Gabor filter at phase = 270 and orientation = 90. Image [16] is created using a Gabor filter at phase = 270 and orientation = 135.

Data object 2602, resulting from process 2601, is an exemplary edge element map according to one embodiment of the invention. An edge element map may be either stored (e.g., in a hard drive, server, database) or input directly into the next processing step to generate subsequent cognitively relevant values, such as edge symmetry or edge parallelism. In one embodiment, an edge element map may be contained or stored in a three dimensional matrix, with the first dimension representing a pixel's location on the X-axis of the digital image, the second dimension representing a pixel's location on the Y-axis of the digital image, and the third dimension indexing the convolving Gabor filters, a number of which are illustrated in Figure 3a. The Gabor filters may vary in their phases, orientations, and spatial frequencies. As such, the example currently being described will continue with the use of a three dimensional matrix, even though the use of a three dimensional matrix is not intended to restrict the invention in any way.
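By way of non-limiting illustration only, the construction of such a three dimensional edge element map may be sketched in Python with NumPy as follows. The function names, kernel size, wavelength, and Gaussian width are assumptions made for this sketch and do not appear in the specification; the array is indexed (row, column, filter), i.e. Y before X, following NumPy convention.

```python
import numpy as np

def gabor_kernel(size, wavelength, orientation_deg, phase_deg, sigma):
    """Return an odd-sized square Gabor kernel: Gaussian envelope times cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    theta = np.deg2rad(orientation_deg)
    # Coordinate along the direction in which the carrier oscillates.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength + np.deg2rad(phase_deg))
    kernel = envelope * carrier
    return kernel - kernel.mean()  # zero mean, so uniform regions give no response

def convolve_same(image, kernel):
    """Plain 2-D convolution with zero padding; output has the shape of image."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]
    out = np.zeros(image.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out

def edge_element_map(image, orientations=(0, 45, 90, 135),
                     phases=(0, 90, 180, 270),
                     wavelength=8.0, size=15, sigma=4.0):
    """Stack one filter response per (phase, orientation) pair along the third axis."""
    layers = []
    for phase in phases:               # same ordering as images [1] to [16]
        for orientation in orientations:
            kernel = gabor_kernel(size, wavelength, orientation, phase, sigma)
            layers.append(convolve_same(image, kernel))
    return np.stack(layers, axis=-1)

# Synthetic luminance channel with one vertical dark-to-light boundary.
image = np.zeros((32, 32))
image[:, 16:] = 1.0
emap = edge_element_map(image)
print(emap.shape)  # (32, 32, 16)
```

In this sketch only a single spatial frequency is used; additional wavelengths would simply add further layers along the third dimension.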

Spurious edge elements are removed in process 2603. An exemplary method is to use the mechanism of correspondence, explained in the description of Figure 6 and Figure 7 below, to search for edge element groupings with curvilinear orientation and close proximity. Edge elements that cannot be grouped into continuous contours are considered spurious and may be removed by process 2603. The resulting data object 2604 may be, according to one embodiment of the invention, a luminance contour map as illustrated in Figure 9a. In this particular representation, the map is a two dimensional matrix of maximum values, the maximum being taken across the flattened depth dimension.

Data object 2605 is an exemplary cognitively relevant value of figureness, as determined from a perceptual processing system according to one embodiment of the invention. An exemplary figureness map, derived from Figure 2a, is illustrated in Figure 4. The image shown in Figure 4 is created by processing the image shown in Figure 2a through color binning and processing a color dominance figure/ground module.

An exemplary method for performing process 2606 is to perform a fuzzy logical AND operation, which results in a fuzzy intersection of the contour map 2604 and cognitively relevant map 2605. Figure 9b illustrates an exemplary edge map produced from process 2606, according to one embodiment of the invention. Though the map shows crisp Boolean values, these values may also be fuzzy membership values. Edges produced from the image processing and recognition system described herein may also be considered new cognitively relevant values. The new cognitively relevant values can be utilized as a new edge parsing, which may be used for later processing.
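As one possible illustration of the fuzzy intersection described above, the following Python sketch uses the pointwise minimum as the fuzzy AND. The choice of the minimum operator, the function name, and the sample membership values are assumptions of this sketch; the specification does not state which fuzzy AND operator is used.

```python
import numpy as np

def fuzzy_and(contour_map, figure_map):
    """Fuzzy intersection of two membership maps, taken as the pointwise minimum (assumed operator)."""
    return np.minimum(contour_map, figure_map)

# Hypothetical membership values in [0, 1] for a 2 x 2 region.
contour = np.array([[0.9, 0.2],
                    [0.6, 0.0]])   # stands in for contour map 2604
figure = np.array([[1.0, 1.0],
                   [0.3, 0.8]])    # stands in for figureness map 2605
edge = fuzzy_and(contour, figure)
print(edge)
```

When both inputs hold only the values 0 and 1, the minimum reduces to the ordinary Boolean AND, which is consistent with an edge map that shows crisp values.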

Though the preceding embodiment employs the steps of receiving digital image data 2600 and processing the digital image data 2600 to produce an edge element map 2602, embodiments of the present invention do not necessarily need to perform these steps. For example, another embodiment of the invention may process a previously created edge element map, or data that is operatively similar to an edge element map, for a digital image. The edge element map, or data that is operatively similar to an edge element map, may then be stored in a three dimensional matrix, or two dimensional matrix of maximum values as an optimization. An exemplary compressed luminance edge element map that corresponds to Figure 2b is provided in Figure 5. A compressed edge element map is a full edge element map which has been compressed across the third dimension by using a max function.
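The compression across the third dimension described above may be sketched, purely as an illustration, with a single NumPy reduction. The function name and the sample values are hypothetical.

```python
import numpy as np

def compress_edge_element_map(full_map):
    """Collapse a (rows, columns, filters) map to (rows, columns) by taking the max over filters."""
    return full_map.max(axis=2)

# Hypothetical 2 x 2 image with three filter responses per pixel.
full = np.array([[[0.1, 0.7, 0.2], [0.0, 0.0, 0.3]],
                 [[0.5, 0.4, 0.9], [0.6, 0.1, 0.2]]])
compressed = compress_edge_element_map(full)
print(compressed)
```

The compressed map no longer records which filter produced the maximum; a variant that also needs the winning orientation or phase could retain `full.argmax(axis=2)` alongside it.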

In one embodiment of the invention, an elimination of edge possibilities may occur so as to create a set of edges that obey the perception rules of grouping by proximity and good continuation. In one embodiment of the invention, the edge parsings stored in the edge element map might be used as inputs for further processing to determine various cognitively relevant values.

Processing an edge element map 2602 of Figure 1 may occur in various ways. For example, a fuzzy correspondence system may be used to determine which pixels to delete from the edge element map 2602. In such a system, correspondence measures the fuzzy co-occurrence of two or more attributes. In this case, the attributes are two edge elements of similar orientations, one being the shift attribute and one being the base attribute. Correspondence is computed by holding the base attribute steady, and shifting the shift attribute over the base attribute. Correspondence is then computed as a function of base pixel/shift pixel similarity and shift distance. In the context of computing continuous contours, a correspondence operator is used to determine if an edge element is continuous. This is accomplished by selecting shifts in those orientations that are parallel (or close to parallel) to the edge elements in question. A high correspondence indicates that an edge at a similar orientation occurs within close proximity. This signals that an edge element is part of a continuous contour. Thus, in the case of computing contours, the base attribute is an edge element map, and the shift attribute is also the same edge element map.

Figure 6 illustrates an example of attribute correspondence between different attributes of an edge element map, according to one embodiment of the invention. The correspondence for a pixel within the base attribute is equal to the minimum shift distance needed to line a pixel up with the shift attribute. An example of shifting is shown in Figure 7. The deviation away from attribute correspondence may be represented as a fuzzy correspondence with a Gaussian membership curve, as shown in Figure 8. Attribute correspondence occurs at the peak, and then drops off steadily as the shift distance increases. The deviation from the center follows a curve similar to the Gaussian distribution. This processing eliminates edges based on the similarity of any two attributes defined along the same scale. Fuzzy correspondence may calculate attribute correspondences that are not necessarily exact, but may be approximate based on a comparison of two or more attribute sets to determine if they co-exist with similar strength or at similar locations.
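A minimal sketch of one way such a shift-based fuzzy correspondence might be computed is given below in Python. It is an illustration under stated assumptions, not the disclosed implementation: the function names, the `direction` step, `max_shift`, `sigma`, and the 0.5 threshold are all hypothetical, the zero shift is excluded because the map is compared with itself, and shifts are assumed to be smaller than the image dimensions.

```python
import numpy as np

def shift_map(layer, dy, dx):
    """Shift a 2-D array by (dy, dx) pixels, filling vacated cells with zeros."""
    out = np.zeros_like(layer)
    h, w = layer.shape
    src = layer[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def contour_correspondence(edges, direction, max_shift=3, sigma=1.5):
    """Gaussian membership of the smallest non-zero shift that lines each
    edge pixel up with another edge pixel of the same map."""
    dy, dx = direction              # unit pixel step parallel to the edge elements
    present = edges > 0
    membership = np.zeros(edges.shape, dtype=float)
    # Visit large shifts first so that smaller shifts overwrite them.
    for d in range(max_shift, 0, -1):
        for sign in (1, -1):
            shifted = shift_map(present, sign * d * dy, sign * d * dx)
            hit = present & shifted
            membership[hit] = np.exp(-d**2 / (2.0 * sigma**2))
    return membership

def remove_spurious(edges, direction, threshold=0.5, **kwargs):
    """Zero out edge elements whose correspondence falls below the threshold."""
    keep = contour_correspondence(edges, direction, **kwargs) >= threshold
    return np.where(keep, edges, 0.0)

# A horizontal contour plus one isolated element.
edges = np.zeros((5, 7))
edges[2, 1:6] = 1.0
edges[0, 6] = 1.0
cleaned = remove_spurious(edges, direction=(0, 1))
print(cleaned)
```

In this sketch the isolated element has no neighbor along the shift direction within `max_shift` pixels, so its membership stays at zero and it is removed, while every element of the horizontal contour lines up with a neighbor after a shift of one pixel and is retained.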

After the edge element map 2602, or data that is operatively similar, is processed by attribute correspondence or some other process which eliminates spurious edges based on a cognitively relevant value, a contour map 2604 is created. The contour map 2604 is created by utilizing the data produced from process 2603 and is a cognitively relevant value that is dependent on measures from physical value inputs. Figure 9a is an exemplary contour map created based on the cognitively relevant value of luminance contrast and spurious edge deletion 2603. Figure 9b is an exemplary figure edge produced by the logical AND operation performed on the figure input provided by 2605 and contour map 2604, or in this example, Figure 4 and Figure 9a.

Figure 10 illustrates an exemplary computer architecture for use with the present system, according to one embodiment. Computer architecture 1000 can be used to implement the computer systems or image processing and recognition systems described in various embodiments of the invention. One embodiment of architecture 1000 comprises a system bus 1020 for communicating information, and a processor 1010 coupled to bus 1020 for processing information. Architecture 1000 further comprises a random access memory (RAM) or other dynamic storage device 1025 (referred to herein as main memory), coupled to bus 1020 for storing information and instructions to be executed by processor 1010. Main memory 1025 also may be used for storing temporary variables or other intermediate information during execution of instructions by processor 1010. Architecture 1000 also may include a read only memory (ROM) and/or other static storage device 1026 coupled to bus 1020 for storing static information and instructions used by processor 1010.

A data storage device 1027 such as a magnetic disk or optical disc and its corresponding drive may also be coupled to computer system 1000 for storing information and instructions. Architecture 1000 may also be coupled to a second I/O bus 1050 via an I/O interface 1030. A plurality of I/O devices may be coupled to I/O bus 1050, including a display device 1043 and an input device (e.g., an alphanumeric input device 1042 and/or a cursor control device 1041). For example, web pages and business related information may be presented to the user on the display device 1043.

The communication device 1040 is for accessing other computers (servers or clients) via a network. The communication device 1040 may comprise a modem, a network interface card, a wireless network interface or other well known interface device, such as those used for coupling to Ethernet, token ring, or other types of networks. Although the present method and system have been described in connection with an image processing and recognition system, one of ordinary skill would understand that the techniques described may be used in any situation where it is desirable to detect edges contained in a digital image.


Claims

I claim:

1. A computer-implemented method, comprising: receiving as an input a digital image; processing the digital image to detect one or more edges of cognitively relevant values within the digital image; and eliminating spurious edges to create a set of edges as a function of curvilinear orientation and proximity.

2. The computer-implemented method of claim 1, further comprising detecting the one or more edges of the digital image as a function of a figure-ground designation of the digital image.

3. The computer-implemented method of claim 1, further comprising detecting the one or more edges of the digital image as a function of one or more of the following: contrast, edgeness, color, texture, symmetry, closure, parallelism, blur and figure-ground designation of the digital image.

4. The computer-implemented method of claim 1, further comprising creating an edge element map based on the detected one or more edges, the edge element map being contained in a three dimensional matrix.

5. The computer-implemented method of claim 4, wherein data stored in the three dimensional matrix is based on each pixel's location on the Y-axis of the digital image, each pixel's location on the X-axis of the digital image, and each pixel's depth in the digital image.

6. The computer-implemented method of claim 4, wherein eliminating the edge from the one or more edges comprises defining similarities between attributes of the three dimensional matrix and determining if attribute correspondence occurs between the attributes.

7. The computer-implemented method of claim 6, further comprising searching for the attribute correspondence by shifting the attributes until pixels from the different attributes line up to each other.

8. A computer-readable medium having stored thereon a plurality of instructions, said plurality of instructions when executed by a computer, cause said computer to perform: receiving as an input a digital image; processing the digital image to detect one or more edges of cognitively relevant values within the digital image; and eliminating spurious edges to create a set of edges as a function of curvilinear orientation and proximity.

9. The computer-readable medium of claim 8, wherein the plurality of instructions further cause the computer to perform: detecting the one or more edges of the digital image as a function of a figure-ground designation of the digital image.

10. The computer-readable medium of claim 8, wherein the plurality of instructions further cause the computer to perform: detecting the one or more edges of the digital image as a function of one or more of the following: contrast, edgeness, color, texture, symmetry, closure, parallelism, blur and figure-ground designation of the digital image.

11. The computer-readable medium of claim 8, wherein the plurality of instructions further cause the computer to perform: creating an edge element map based on the detected one or more edges, the edge element map being contained in a three dimensional matrix.

12. The computer-readable medium of claim 11, wherein data stored in the three dimensional matrix is based on each pixel's location on the Y-axis of the digital image, each pixel's location on the X-axis of the digital image, and each pixel's depth in the digital image.

13. The computer-readable medium of claim 11, wherein the plurality of instructions further cause the computer to perform: defining similarities between attributes of the three dimensional matrix and determining if attribute correspondence occurs between the attributes.

14. The computer-readable medium of claim 13, wherein the plurality of instructions further cause the computer to perform: searching for the attribute correspondence by shifting the attributes until pixels from the different attributes line up to each other.

15. A computer system, comprising: a processor; and memory coupled to the processor, the memory storing instructions; wherein the instructions when executed by the processor cause the processor to: receive as an input a digital image; processing the digital image to detect one or more edges of cognitively relevant values within the digital image; and eliminating spurious edges to create a set of edges as a function of curvilinear orientation and proximity.

16. The computer system of claim 15, wherein the instructions further cause the processor to: detect the one or more edges of the digital image as a function of a figure-ground designation of the digital image.

17. The computer system of claim 15, wherein the instructions further cause the processor to: detect the one or more edges of the digital image as a function of one or more of the following: contrast, edgeness, color, texture, symmetry, closure, parallelism, blur and figure-ground designation of the digital image.

18. The computer system of claim 15, wherein the instructions further cause the processor to: create an edge element map based on the detected one or more edges, the edge element map being contained in a three dimensional matrix.

19. The computer system of claim 18, wherein data stored in the three dimensional matrix is based on each pixel's location on the Y-axis of the digital image, each pixel's location on the X-axis of the digital image, and each pixel's depth in the digital image.

20. The computer system of claim 18, wherein the instructions further cause the processor to: define similarities between attributes of the three dimensional matrix and determine if attribute correspondence occurs between the attributes.

21. The computer system of claim 20, wherein the instructions further cause the processor to: search for the attribute correspondence by shifting the attributes until pixels from the different attributes line up to each other.