
SUBSTANCE: the invention relates to signal representation technology. The technical result is extended functionality. The system for generating a compact description of digital materials contains: an acquisition module configured to receive digital material; a segmentation module configured to divide said material into a plurality of areas; a calculation module configured to generate feature vectors for each area of said plurality, wherein the feature vectors are calculated based on matrix invariances, including singular value decomposition; and an output module configured to generate an output using a combination of the computed feature vectors, wherein the output forms a hash value vector for the digital material, the hash value vector being a compact representation of the digital material, so that the digital material can be identified based on said compact representation. 2 independent claims and 7 dependent claims, 3 figures.

Drawings to the RF patent 2387006

The field of technology to which the invention belongs

This invention relates generally to signal representation technology.

State of the art

Digital content is often distributed to consumers over private and public networks such as an intranet or the Internet. In addition, these materials are distributed to consumers via fixed, computer-readable media such as a compact disc (CD-ROM), a digital versatile disc (DVD), a magnetic floppy disk, or a magnetic hard disk (e.g., a preloaded hard disk).

Unfortunately, it is relatively easy for a person to pirate the original content of digital material at the expense of the owners of that content, who include the content's author, publisher, developer, distributor, etc. The content-based industries (e.g., entertainment, music, film, software) that produce and distribute content are plagued by constant loss of revenue due to digital piracy.

"Digital materials" is a general term used in this application to refer to electronically stored or transmitted content. Examples of digital materials include images, audio clips, videos, multimedia information, software, and data. Depending on the context, digital materials may also be referred to as a "digital signal", "content signal", "digital bitstream", "multimedia signal", "digital object", "object", "signal", and the like.

In addition, digital materials are often stored in massive databases - either structured or unstructured. As these databases grow, the need for rationalized categorization and identification of materials increases.

Hashing

Hashing technologies are used for many purposes, among them protecting the rights of content owners and speeding up database search and access. They are used in database management, querying, cryptography, and many other areas involving large amounts of raw data.

In general, hashing technology maps (transforms) a large block of raw data into a relatively small and structured set of identifiers. These identifiers are also called "hash values" or simply "hash". By introducing a special structure and order to the raw data, the hash function greatly reduces the size of the raw data into a smaller (and usually more manageable) representation.

Conventional hashing limitations

Conventional hashing techniques are used for many kinds of data. These techniques perform well and are well understood. Unfortunately, digital materials with visual and/or audio content present a unique set of challenges not found in other digital data. This is mainly because the content of such materials is subject to perceptual assessment by human observers. Typically, the perceptual assessment is visual and/or auditory.

For example, suppose that the content of two digital materials is actually different, but the difference is not perceptually significant. A human observer may regard the contents of the two digital materials as similar to each other. However, even perceptually insignificant differences in content properties (such as color, pitch, intensity, phase) between two digital materials result in the two materials appearing significantly different in the digital domain.

Thus, when a conventional hashing function is used, a slightly modified version of a digital material generates a significantly different hash value from that of the original digital material, even though the two are essentially identical (i.e., perceptually the same) to a human observer.

The human observer is quite tolerant to certain changes in digital materials. For example, human ears are less sensitive to changes in audio signal components in some frequency bands than components in other frequency bands.

This human tolerance can be exploited (by pirates) for illegal or unscrupulous purposes. For example, a pirate may use advanced audio processing techniques to remove copyright notices or embedded watermarks from an audio signal without a perceptible change in the quality of the audio signal.

Such malicious modifications of digital materials are called "attacks" and result in changes in the data domain. Unfortunately, the human observer is unable to sense these changes, allowing pirates to successfully distribute unauthorized copies in an illegal manner.

While the human observer is tolerant of such small (i.e., imperceptible) changes, the digital observer, in the form of conventional hashing technology, is not. Traditional hashing techniques do little to identify the common content of an original digital material and a pirated copy of that material, because hashing the original and the pirated copy results in very different hash values. This is true even though both are perceptually identical (i.e., they appear the same to a human observer).
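The sensitivity of conventional hashing described above can be illustrated with a minimal sketch. It assumes SHA-256 as a stand-in for "conventional hashing" and a hypothetical 64 x 64 gray image stored as raw bytes; neither choice comes from this description.

```python
import hashlib

def hamming_fraction(a: bytes, b: bytes) -> float:
    """Fraction of differing bits between two equal-length byte strings."""
    diff = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return diff / (8 * len(a))

# Hypothetical 8-bit grayscale "image": 64 x 64 mid-gray pixels.
original = bytes([128] * (64 * 64))

# Perceptually negligible change: one pixel brightened by a single level.
modified = bytearray(original)
modified[0] = 129
modified = bytes(modified)

h_orig = hashlib.sha256(original).digest()
h_mod = hashlib.sha256(modified).digest()
changed = hamming_fraction(h_orig, h_mod)
print(f"fraction of hash bits changed: {changed:.2f}")
```

Although the two inputs would look the same to a viewer, the two digests are unrelated, which is the behavior a perceptual hash is meant to avoid.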

Applications of hashing technologies

There are many and varied applications of hashing technologies. Some include anti-piracy, content categorization, content recognition, watermarking, content-based key generation, and synchronization in audio and video streams.

Hashing technologies can be used to search the Web for digital material suspected of being pirated. In addition, hashing techniques are used for content-based signal key generation. These keys are used instead of or in addition to the secret keys. Hash functions can also be used to synchronize input signals. Examples of such signals include video or multimedia signals. The hashing technology must be fast if the synchronization is done in real time.

The essence of the invention

Described in this application is an implementation that produces a new representation of digital material (such as an image) in a newly defined representation domain. In particular, the representations in this new domain are based on matrix invariances. In some implementations, these matrix invariances may, for example, rely heavily on singular value decomposition (SVD).

Brief description of the drawings

Like reference numerals are used throughout the drawings to refer to like elements and features.

Fig. 1 is a block diagram showing the described methodological implementation.

Fig. 2 is a block diagram of the described implementation.

Fig. 3 is an example of a computer operating environment that allows (full or partial) implementation of at least one described embodiment.

Detailed description

In the following description, specific numbers, materials, and configurations are set forth for purposes of explanation in order to provide a thorough understanding of the present invention. However, one skilled in the art will appreciate that the present invention may be practiced without these specific illustrative details. In other instances, well-known features have been omitted or simplified in order to make the description of exemplary embodiments of the present invention clear and thereby better explain the present invention. Moreover, for ease of understanding, some steps of the method are highlighted as separate steps; however, these separately highlighted steps should not be construed as necessarily dependent on the order in which they are performed.

The following description discloses one or more exemplary implementations of a Digital Material Representation based on Matrix Invariances that contain the elements listed in the appended claims. These implementations are described in sufficient detail to meet the statutory requirements of written description, enablement, and disclosure of the best mode of carrying out the invention. However, this description itself is not intended to limit the scope of this patent.

The illustrative implementations described below are examples. These illustrative implementations do not limit the scope of the claimed present invention; rather, the present invention may also be embodied and implemented in other ways in connection with other current or future technologies.

One example of an implementation of a Digital Material Representation based on matrix invariances may be referred to as an "exemplary material representer".

Wherever randomization is mentioned, it should be understood that the randomization is performed by a pseudo-random number generator (e.g., RC4) whose seed is a secret key (k), and that this key is unknown to the adversary.

Introduction

One or more illustrative implementations of this invention, described below, may be implemented (in whole or in part) on computer systems and computer networks such as those shown in Fig. 3. While the implementations can have many applications, cryptosystems, authorization, and security are examples of specific applications.

An exemplary material representer derives vectors of robust characteristics of digital materials from pseudo-randomly selected quasi-global areas of these materials by means of matrix invariances. Such regions may (but need not) overlap.

Unlike conventional approaches, the calculations in the exemplary material representer are based on matrix invariances (such as those based on singular value decomposition (SVD)). The SVD components capture the essential characteristics of digital materials.

Quasi-global characteristics

Quasi-global characteristics are representative of the general characteristics of a group or collection of individual elements. For example, they may be statistics or features of "areas" (i.e., "segments"). Quasi-global characteristics do not represent the individual local characteristics of individual elements; rather, they represent the perceptual content of the group (e.g., a segment) as a whole.

Quasi-global characteristics can be determined by means of a mathematical or statistical representation of the group. For example, such a characteristic could be the average of the color values of all the pixels in a group. Therefore, quasi-global characteristics may also be referred to as "statistical characteristics". Local characteristics do not provide robust statistical characteristics.
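As a minimal illustration of why a statistic over a group is more stable than an individual element, the sketch below compares the change in a region average with the largest per-pixel change under a small perturbation. The image, the noise level, and the region position are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
image = rng.integers(0, 256, size=(64, 64)).astype(float)   # hypothetical image
noise = rng.uniform(-2.0, 2.0, size=image.shape)            # small perturbation
perturbed = image + noise

region = (slice(8, 40), slice(16, 48))   # one 32 x 32 area ("segment")

# Local view: the largest change suffered by any single pixel in the area.
max_local_change = np.abs(perturbed[region] - image[region]).max()

# Quasi-global view: change in the average value of the whole area.
global_change = abs(perturbed[region].mean() - image[region].mean())

print(max_local_change, global_change)
```

The average moves far less than individual pixels do, because the per-pixel perturbations largely cancel when summed over the area.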

Notation

Below, uppercase letters (e.g., A, B, C) represent matrices, lowercase letters with vector notation (e.g., $\vec{a}$, $\vec{b}$, $\vec{c}$) represent column vectors, and plain lowercase letters (e.g., a, b, c) represent scalars. The secret key is represented by k.

The following mathematical definitions are used here:

$I \in \mathbb{R}^{n \times n}$: two-dimensional representation of digital materials of size n x n.

$I_n$: the identity matrix of size n x n.

$A_i \in \mathbb{R}^{m \times m}$: a matrix that represents the i-th pseudo-random area (for example, a rectangle of size m x m) taken from the digital materials.

$A^T$: the transpose of matrix A.

$\|A\|_F$: the Frobenius norm of a matrix A, defined as

$\|A\|_F = \left( \sum_{k=1}^{m} \sum_{l=1}^{m} a_{k,l}^2 \right)^{1/2}$,

where $a_{k,l}$ is the element of A in row k and column l.

$A^H$: the Hermitian conjugate (conjugate transpose) of matrix A. Note that $A^H = A^T$ for real matrices.

$\|\vec{v}\|_2$: the L2 norm of a vector, defined as

$\|\vec{v}\|_2 = \left( \sum_{k} v_k^2 \right)^{1/2}$,

where $v_k$ is the k-th element of $\vec{v}$.

$D \in \mathbb{R}^{m \times m}$: the DCT transformation matrix of size m for 1-dimensional signals of length m. Note that the 2-dimensional DCT of a matrix I (of size m x m) is defined as $D I D^T$.

$W \in \mathbb{R}^{m \times m}$: the DWT transformation matrix of size m for 1-dimensional signals of length m. Note that the 2-dimensional DWT of a matrix I (of size m x m) is defined as $W I W^T$.

$\|\vec{a}\|_H$: the Hamming weight of a binary vector $\vec{a}$.

The SVD of a matrix $A \in \mathbb{R}^{m \times m}$ is defined as $A = U \Sigma V^H$, where:

$U = [\vec{u}_1 \ \vec{u}_2 \ \dots \ \vec{u}_m]$: orthogonal eigenvectors of the matrix $A A^H$ (which in general may not be unique). The $\{\vec{u}_i\}$ are called the left singular vectors of A.

$V = [\vec{v}_1 \ \vec{v}_2 \ \dots \ \vec{v}_m]$: orthogonal eigenvectors of the matrix $A^H A$ (which in general may not be unique). The $\{\vec{v}_i\}$ are called the right singular vectors of A.

$\Sigma$: an m x m diagonal real matrix, where the i-th diagonal element, $\sigma_i$, is called the i-th singular value. Without loss of generality, we can assume that $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_m$.

Singular Value Decomposition (SVD)

The exemplary material representer captures the essence of geometric information while providing dimensionality reduction. SVD has some provable optimality properties: the best lower-dimensional (say, K-dimensional) approximation of a matrix (say, of rank N, N >= K) in the sense of the Frobenius norm is provided by the first K singular vectors and the corresponding singular values.
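The optimality property stated above can be checked numerically. The following is a minimal sketch using NumPy on a hypothetical random matrix; the matrix size, the value of K, and the random rank-K competitor are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))   # hypothetical full-rank matrix
K = 3

U, s, Vh = np.linalg.svd(A)       # singular values come back in descending order
A_K = U[:, :K] @ np.diag(s[:K]) @ Vh[:K, :]   # keep the first K singular triples

err = np.linalg.norm(A - A_K, "fro")
tail = np.sqrt(np.sum(s[K:] ** 2))   # energy of the discarded singular values

# An arbitrary rank-K matrix, used only as a point of comparison.
B = rng.standard_normal((8, K)) @ rng.standard_normal((K, 8))
err_random = np.linalg.norm(A - B, "fro")

print(err, tail, err_random)
```

The truncation error equals the norm of the discarded singular values, and no other matrix of rank K can achieve a smaller Frobenius error.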

The essence of quasi-global properties and the geometric information of digital materials (such as images) are compactly captured by the meaningful SVD components of such materials. Such components are approximately invariant under intentional or unintentional perturbations as long as the digital materials of interest are not perceptually altered too much.

In the exemplary material representer, SVD is applied to pseudo-randomly selected quasi-global areas of images, mainly for security reasons. SVD components derived from these areas accurately represent the overall properties of the digital materials and have suitable robustness properties, while still providing reasonable security as long as a sufficient number and size of areas are used.

The conventional choices have been the DCT (Discrete Cosine Transform) and the DWT (Discrete Wavelet Transform). With DCT and DWT, digital materials are projected onto a fixed set of basis vectors. DCT and DWT have proven to be generally effective for conventional material processing applications.

Instead of fixed-basis transformations of the DCT/DWT type, the exemplary material representer uses singular value decomposition (SVD). With SVD, the exemplary material representer selects the optimal basis vectors in the sense of the L2 norm (see equation (1) below). Moreover, for a given matrix, the singular values are unique (the singular vectors in general may not be). As an analogy, if digital material is represented by a vector in some high-dimensional vector space, then the singular vectors provide information about the optimal direction with respect to the material in the sense of equation (1), while the singular values provide information about the distance along that direction. Therefore, singular vectors that correspond to large singular values are naturally susceptible to any scaling attack and other small modifications of conventional signal processing.

Using SVD decomposition, digital materials can be viewed as a two-dimensional surface in three-dimensional space. When DCT-like transformations are applied to a digital material (or surface), information about any particularly distinctive (hence important) geometric property of the digital material is distributed across all coefficients.

For example, an image may have a surface with strong peaks (for example, very bright patches on a dark background); in the case of DCT, the information about these peaks is distributed across all transform coefficients. Using SVD, the exemplary material representer stores both the magnitude of these important properties (in the singular values) and their location and geometry (in the singular vectors). Therefore, the combination of the largest left and right singular vectors (i.e., those corresponding to the largest singular values) captures the important geometric properties of the image in the sense of the L2 norm.
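The contrast between a fixed basis and SVD can be seen on a toy image. The sketch below builds an orthonormal DCT-II matrix by hand (an assumed concrete form of the DCT transformation matrix D), applies the two-dimensional DCT as D I D^T to a hypothetical dark image with one bright rectangular patch, and counts how many DCT coefficients and how many singular values are non-negligible.

```python
import numpy as np

def dct_matrix(m: int) -> np.ndarray:
    """Orthonormal DCT-II matrix for 1-dimensional signals of length m."""
    n = np.arange(m)
    k = n.reshape(-1, 1)
    D = np.sqrt(2.0 / m) * np.cos(np.pi * (2 * n + 1) * k / (2 * m))
    D[0, :] = np.sqrt(1.0 / m)   # the DC row uses a different scale factor
    return D

m = 16
image = np.zeros((m, m))
image[4:7, 9:12] = 1.0           # one bright patch on a dark background

D = dct_matrix(m)
coeffs = D @ image @ D.T         # two-dimensional DCT
s = np.linalg.svd(image, compute_uv=False)

tol = 1e-9
n_dct = int(np.sum(np.abs(coeffs) > tol))   # DCT coefficients carrying energy
n_sv = int(np.sum(s > tol))                 # singular values carrying energy
print(n_dct, n_sv)
```

For this rank-one patch a single singular triple describes the whole image, whereas the same information is spread over a large number of DCT coefficients.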

SVD Properties

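The mathematical properties of SVD are described below. Let $A = U \Sigma V^H$ be the SVD of A. Then:

1) The left singular vectors are an orthogonal basis for the column space of A.

2) The right singular vectors are an orthogonal basis for the row space of A.

3) $(\sigma_1, \vec{u}_1, \vec{v}_1) = \arg\min_{a, \vec{x}, \vec{y}} \|A - a \vec{x} \vec{y}^H\|_F^2$, (1)

where $\|\vec{x}\|_2 = \|\vec{y}\|_2 = 1$ and, for all k with $1 < k \le m$,

$(\sigma_k, \vec{u}_k, \vec{v}_k) = \arg\min_{a, \vec{x}, \vec{y}} \left\| A - \sum_{l=1}^{k-1} \sigma_l \vec{u}_l \vec{v}_l^H - a \vec{x} \vec{y}^H \right\|_F^2$,

where $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_m$ are the singular values and $\{\vec{u}_i\}$, $\{\vec{v}_i\}$ are the corresponding singular vectors.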

Hashing

The hash function used by the exemplary material representer takes two inputs: a digital material (such as an image) I and a secret key k. This hash function generates a short vector from the set of cardinality 2 k . It is desirable that the perceptual hash value be identical with high probability for all perceptually similar digital materials. It is also desirable that two perceptually different digital materials generate unrelated hash values with high probability. Such a hash function is a many-to-one transformation. On the other hand, for most applications it may be sufficient to have approximately similar (respectively, different) hash values for perceptually similar (respectively, different) inputs with high probability, i.e., the hash function may exhibit graceful degradation.

The requirements for such a hash function are given as:

1) Randomization: For any given input value, its hash value must be approximately uniformly distributed among all possible output values. The probability measure is given by the secret key.

2) Pairwise Independence: The output hash values for two perceptually different digital materials must be independent with high probability, where the probability space is given by the secret key.

3) Invariance: For all possible acceptable perturbations, the output value of the hash function must remain approximately invariant with high probability, where the probability space is given by the secret key.

Two digital materials are considered to be perceptually similar when there are no sufficiently noticeable differences between them in terms of human perception.

Methodological implementations of the exemplary material representer

Fig. 1 shows a methodological implementation of the exemplary material representer. This methodological implementation can be carried out in software, hardware, or a combination thereof.

At step 110, the exemplary material representer receives the input digital materials. For this description, the input digital materials are an n x n image, which can be described as $I \in \mathbb{R}^{n \times n}$. Note that this image can also be rectangular (i.e., its dimensions can differ). The approach generalizes to that case without difficulty.

In step 120, the exemplary material representer pseudo-randomly generates multiple regions from I. The number of regions may be denoted p, and the shape of these regions may be, for example, a rectangle. The shape of the regions may vary from implementation to implementation.

Although not required, these regions may overlap with each other. However, there may be an implementation that requires such an overlap. Conversely, there may be an implementation that does not allow overlap.

$A_i$ is a matrix that represents the i-th pseudo-random region (e.g., a rectangle of size m x m) taken from the digital materials. Note that the regions may be matrices of different sizes; the approach accommodates this without difficulty.
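Step 120 can be sketched as follows. This is a minimal example under stated assumptions: NumPy's default generator seeded with the key stands in for the keyed pseudo-random number generator (the description mentions RC4 as one example), all regions are squares of the same size m, and the function name `select_regions` is hypothetical.

```python
import numpy as np

def select_regions(image, p, m, key):
    """Return p pseudo-random m x m sub-matrices A_i of `image`, seeded by `key`."""
    rng = np.random.default_rng(key)   # assumed stand-in for the keyed PRNG
    n_rows, n_cols = image.shape
    regions = []
    for _ in range(p):
        # Top-left corner chosen uniformly so the region fits inside the image.
        r = rng.integers(0, n_rows - m + 1)
        c = rng.integers(0, n_cols - m + 1)
        regions.append(image[r:r + m, c:c + m])
    return regions

image = np.arange(64 * 64, dtype=float).reshape(64, 64)   # hypothetical image
A = select_regions(image, p=5, m=16, key=1234)
A_same = select_regions(image, p=5, m=16, key=1234)
```

Because the positions depend only on the key, a party holding the key can reproduce the same regions, while regions chosen this way are free to overlap.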

In step 130, feature vectors (each of which can be denoted $\vec{g}_i$) are generated from each region $A_i$ by an SVD-based transformation. This generation of feature vectors can be generally described as $T_1(A_i) = \vec{g}_i$.

These feature vectors may be used as hash values after appropriate sampling, or they may be used as intermediate features from which the actual hash values are generated. The SVD-based transformation $T_1$ is a hash function that uses the SVD. Examples of hash functions are described below in the section entitled "SVD-based hash functions".

At this point, the exemplary material representer has generated a representation of the digital materials (the collection of feature vectors $\{\vec{g}_1, \dots, \vec{g}_p\}$). Some implementations may end at this stage, combining these vectors to form a hash vector.

In these implementations, $T_1$ can be designed to give the top q singular values of the rectangle $A_i$. Another possibility is to design $T_1$ so that it gives the top q singular vectors (left, right, or both), i.e., the q singular vectors that correspond to the q largest singular values. Naturally, in both cases the parameter q must be chosen appropriately; for example, a logical solution might require q<
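The two designs of the first transformation mentioned above can be sketched with NumPy as follows. The function names, the way the vectors are flattened into one feature vector, and the test matrix are illustrative assumptions, and no sampling step is applied.

```python
import numpy as np

def t1_singular_values(A_i, q):
    """Feature vector made of the q largest singular values of A_i."""
    return np.linalg.svd(A_i, compute_uv=False)[:q]

def t1_singular_vectors(A_i, q):
    """Feature vector made of the first q left and right singular vectors."""
    U, _, Vh = np.linalg.svd(A_i)
    # Columns of U and rows of Vh are the left and right singular vectors.
    # Note: each singular vector is only determined up to its sign.
    return np.concatenate([U[:, :q].ravel(order="F"), Vh[:q, :].ravel()])

rng = np.random.default_rng(7)
A_i = rng.standard_normal((16, 16))   # hypothetical region
g_vals = t1_singular_values(A_i, q=3)
g_vecs = t1_singular_vectors(A_i, q=3)

# Scaling the region scales its singular values by the same factor.
g_vals_scaled = t1_singular_values(2.0 * A_i, q=3)
```

With q much smaller than the region size, either variant reduces a 16 x 16 region to a short vector.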

In some implementations, p=1 and A i may be chosen to fit the entire image. Note that this option does not have any randomness; therefore, it is more suitable for non-adversarial (non-conflicting) image hashing applications.

Alternatively, other implementations may perform additional processing to generate even smoother results. Steps 140, 150, 160 and 170 show this.

In step 140, the exemplary material representer generates a secondary representation J of the digital materials by using a pseudo-random combination of the feature vectors. At this point, the vectors generated in step 130 may be considered "intermediate" feature vectors.

As part of generating the secondary representation J, the exemplary material representer collects, from each subsection, the first left and right singular vectors, i.e., those that correspond to the largest singular value.

Let $\Gamma = \{\vec{u}_1, \dots, \vec{u}_p, \vec{v}_1, \dots, \vec{v}_p\}$, where $\vec{u}_i$ (respectively, $\vec{v}_i$) is the first left (respectively, right) singular vector of the i-th subsection. The exemplary material representer then pseudo-randomly generates a smooth representation J from the set $\Gamma$: starting from a pseudo-randomly chosen initial singular vector, J is built up by successively choosing the remaining vectors from $\Gamma$ such that each newly chosen vector is the closest, in the sense of the L2 norm, to the previously chosen one.

Therefore, after 2p steps, all elements of $\Gamma$ are pseudo-randomly reordered and J (of size m x 2p) is formed. Note that the L2 metric can be replaced by any other suitable metric (possibly randomized) in the formation of J, so that continuity and smoothness are achieved. The smooth nature of J may be desirable in some implementations.

Also note that instead of this simple pseudo-random reordering of the vectors, it is possible to apply other (perhaps more complex) operations to generate J.
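The simple reordering described for step 140 can be sketched as a greedy nearest-neighbor pass. In this minimal example the columns of a hypothetical array `gamma` play the role of the collected singular vectors, NumPy's seeded generator stands in for the keyed pseudo-random number generator, and the function name is an assumption.

```python
import numpy as np

def smooth_representation(vectors, key):
    """Reorder the columns of `vectors` (m x 2p) greedily by L2 proximity."""
    rng = np.random.default_rng(key)   # assumed stand-in for the keyed PRNG
    remaining = list(range(vectors.shape[1]))
    # Pseudo-randomly chosen initial vector.
    current = remaining.pop(int(rng.integers(len(remaining))))
    order = [current]
    while remaining:
        # Pick the unused vector closest to the one chosen last.
        dists = [np.linalg.norm(vectors[:, j] - vectors[:, current])
                 for j in remaining]
        current = remaining.pop(int(np.argmin(dists)))
        order.append(current)
    return vectors[:, order], order

rng = np.random.default_rng(11)
gamma = rng.standard_normal((16, 10))   # hypothetical: m = 16, 2p = 10
J, order = smooth_representation(gamma, key=42)
```

Every vector is used exactly once, so J has the same size as the input collection, with neighboring columns chosen to be close to each other.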

In step 150, the exemplary material representer pseudo-randomly generates multiple regions from J. The number of regions may be denoted r, and the shape of these regions may be, for example, rectangular. The shape of the regions may differ from implementation to implementation. As with the regions described above, these regions may be of any shape and may (but need not) overlap.

This action is represented as follows: $B_i$ is a matrix that represents the i-th pseudo-random region (e.g., a rectangle of size d x d) taken from the secondary representation J of the digital materials. Note that in this implementation the rectangles can have different sizes. In other implementations, the rectangles may all be the same size.

At step 160, a new set of feature vectors (each of which can be denoted $\vec{f}_i$) is generated from each region $B_i$ by an SVD-based transformation. This generation of feature vectors can be generally described as $T_2(B_i) = \vec{f}_i$.

These feature vectors are hash values. The SVD-based transformation $T_2$ is a hash function that uses the SVD. Example hash functions are described below in the section entitled "SVD-based hash functions". The SVD-based transformations ($T_1$ and $T_2$) may be the same as or different from each other.

In step 170, the exemplary material representer combines the feature vectors of this new set, $\{\vec{f}_1, \dots, \vec{f}_r\}$, to form the new hash vector, and produces an output that includes this combination of vectors.

SVD-based hash functions

This section describes several hash functions that can be used by the SVD-based transformations ($T_1$ and $T_2$) introduced above in connection with Fig. 1.

SVD-SVD Hash Functions

Given an image, for example, the exemplary material representer pseudo-randomly selects p sub-images $A_i \in \mathbb{R}^{m \times m}$, $1 \le i \le p$. The exemplary material representer then finds the SVD of each sub-image:

$A_i = U_i S_i V_i^T$,

where $U_i$ and $V_i$ are the real m x m matrices of left and right singular vectors, respectively, and $S_i$ is a real diagonal m x m matrix with the singular values along its diagonal.

After generating the secondary representation at step 140, the exemplary material representer again applies the SVD to the subsections $B_i$. As the hash vector, the exemplary material representer stores the corresponding set of the first r left and right singular vectors from each $B_i$ after appropriate sampling.

DCT-SVD

As a variant of the SVD-SVD approach, the exemplary material representer uses the 2D-DCT as the initial transformation ($T_1$) in step 130. After finding the 2D-DCT of each sub-image $A_i$,

$D_i = D A_i D^T$,

only a selected band of frequencies from the coefficient matrix $D_i$ is stored. Here D denotes the DCT transformation matrix. The choice of the lower and upper limits of the band determines the selected frequency range. Coefficients of low-to-mid-range frequencies are more descriptive and distinctive for images. Choosing a lower limit above zero avoids frequencies close to DC, which are more sensitive to simple scaling or DC level changes. Choosing a small upper limit avoids the use of higher-frequency coefficients, which can be altered by the addition of small noise, smoothing, compression, etc. Therefore, appropriate values of the limits can be selected depending on the specific problem.

The coefficients in this frequency range are then stored as a vector $\vec{d}_i$ for each area $A_i$. The ordering of the elements of $\vec{d}_i$ is up to the user and can be used to introduce additional randomness. A secondary representation is then formed in the same way as before, by choosing random vectors from the set $\{\vec{d}_1, \dots, \vec{d}_p\}$ and pseudo-randomly generating a smooth representation J. The exemplary material representer then applies SVD to J:

$J = U S V^T$,

and stores the first left and right singular vectors, $\vec{u}_1$ and $\vec{v}_1$, as hash vectors.
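The DCT-based variant can be sketched as follows. Several things here are assumptions made for illustration and are not taken from the description: the band is taken as the square block of coefficients whose row and column indices both lie in [f_min, f_max), the names `f_min` and `f_max` are hypothetical, the vectors are stacked in their original order instead of being pseudo-randomly reordered into a smooth J, and no sampling is applied.

```python
import numpy as np

def dct_matrix(m):
    """Orthonormal DCT-II matrix for 1-dimensional signals of length m."""
    n = np.arange(m)
    k = n.reshape(-1, 1)
    D = np.sqrt(2.0 / m) * np.cos(np.pi * (2 * n + 1) * k / (2 * m))
    D[0, :] = np.sqrt(1.0 / m)
    return D

def dct_band_vector(A_i, f_min, f_max):
    """Keep the 2D-DCT coefficients with both indices in [f_min, f_max)."""
    D = dct_matrix(A_i.shape[0])
    coeffs = D @ A_i @ D.T
    return coeffs[f_min:f_max, f_min:f_max].ravel()

rng = np.random.default_rng(3)
regions = [rng.uniform(0, 255, size=(16, 16)) for _ in range(6)]   # stand-ins for A_i

# One band vector per region, stacked as the columns of J.
J = np.column_stack([dct_band_vector(A, f_min=1, f_max=5) for A in regions])
U, S, Vh = np.linalg.svd(J)
u1, v1 = U[:, 0], Vh[0, :]   # first left and right singular vectors

# A DC level change (adding a constant) leaves the band untouched when f_min > 0.
d_plain = dct_band_vector(regions[0], f_min=1, f_max=5)
d_shifted = dct_band_vector(regions[0] + 10.0, f_min=1, f_max=5)
```

The last two lines illustrate the stated reason for skipping the DC coefficient: a uniform brightness shift changes only that coefficient, so the stored band is unaffected.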

DWT-SVD

This is a variant of the DCT-SVD approach in which the 2D-DCT is replaced by the 2D-DWT. After obtaining random rectangles $A_i$ from the image, an l-level DWT is applied to each $A_i$. The DC subbands are stored as vectors $\vec{d}_i$ to form the secondary representation J in the next stage. Then SVD is applied to J:

$J = U S V^T$.

The first left and right singular vectors, corresponding to the largest singular value, are stored as hash vectors after appropriate sampling.
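Extracting the DC subband of an l-level DWT can be sketched as below. The description does not say which wavelet is used; this example assumes the Haar wavelet, builds its one-level transformation matrix W explicitly, applies the two-dimensional transform as W A W^T, and treats the top-left (approximation) block as the DC subband. The function names are hypothetical.

```python
import numpy as np

def haar_matrix(m):
    """One-level orthonormal Haar DWT matrix for signals of even length m."""
    W = np.zeros((m, m))
    half = m // 2
    for i in range(half):
        W[i, 2 * i] = W[i, 2 * i + 1] = 1 / np.sqrt(2)   # averaging rows
        W[half + i, 2 * i] = 1 / np.sqrt(2)              # differencing rows
        W[half + i, 2 * i + 1] = -1 / np.sqrt(2)
    return W

def dc_subband(A_i, levels):
    """Approximation subband after `levels` rounds of the 2D Haar DWT."""
    LL = A_i
    for _ in range(levels):
        m = LL.shape[0]
        W = haar_matrix(m)
        LL = (W @ LL @ W.T)[: m // 2, : m // 2]   # keep the top-left block
    return LL

A_i = np.ones((8, 8))                          # hypothetical constant region
d_i = dc_subband(A_i, levels=2).ravel()        # vector stored for this region
print(d_i)
```

Each level halves the side length of the block, so the stored vector shrinks by a factor of four per level.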

Binary SVD

Instead of operating on the original domain, the exemplary material representer generates a binary representation from the original image, preserving the significant areas of the digital materials. If the materials are an image, this approach may threshold the image pixels, where the threshold level is chosen such that only t percent of the image pixels are represented by ones (or zeros). Alternatively, the threshold level may be chosen such that, in each sub-image, only t percent of the pixels are ones (or zeros).
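The thresholding step can be sketched with a percentile computation. This minimal example covers only the first option above (one threshold for the whole image); the function name and the test image are hypothetical.

```python
import numpy as np

def binarize(image, t):
    """Threshold so that roughly t percent of the pixels become ones."""
    threshold = np.percentile(image, 100 - t)
    return (image > threshold).astype(np.uint8)

rng = np.random.default_rng(5)
image = rng.uniform(0, 255, size=(64, 64))   # hypothetical grayscale image
I_b = binarize(image, t=20)
fraction = I_b.mean()                        # share of pixels set to one
print(f"{100 * fraction:.1f} percent of pixels are ones")
```

When many pixels share the same value, the achieved share of ones can deviate from t, so a practical implementation may need a tie-breaking rule.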

Given an image I, the binary image after thresholding can be represented as $I_b$, and the first left and right binary singular vectors, corresponding to the largest singular value, can be defined as

$(\vec{u}_{b1}, \vec{v}_{b1}) = \arg\min_{\vec{x}, \vec{y}} \| I_b \oplus \vec{x} \vec{y}^T \|_H$,

where $\vec{x}$ and $\vec{y}$ are binary vectors and $\oplus$ is the binary XOR operation. The other singular vectors can be found successively, so that the (k+1)-th singular vector pair is derived from $I_b \oplus \bigoplus_{l=1}^{k} \vec{u}_{bl} \vec{v}_{bl}^T$, where $\oplus$ is used for summation.

Therefore, after thresholding, the first binary singular vectors of each binary sub-image are found and form a set. After generating the secondary binary representation $J_b$ in the second stage, the exemplary material representer again uses binary SVD on r pseudo-randomly selected areas. The final hash value is given by the set of first binary singular vector pairs found for these r areas.

Direct SVD

$T_1$ can be chosen as the identity transform, so that the subsections are used directly. This idea is readily applicable to binary digital materials (such as a binary image $I_b$), which can be generated by thresholding. From each subsection $A_i$ of size m x m, vectors $\vec{d}_i$ are formed directly from the samples of the materials. The secondary representation J is generated directly from the set of these vectors. The exemplary material representer then applies the SVD to J:

$J = U S V^T$,

and keeps the first left and right singular vectors as hash vectors.

Exemplary System for Generating Representations of Digital Materials

Fig. 2 shows an exemplary system 200 for generating a representation of digital material, which is an embodiment of the exemplary material representer.

System 200 generates a representation (e.g., a hash value) of the digital material. In this example, the digital material is an image. The system 200 includes a material acquisition module 210, a partition module 220, an area statistics calculation module 230, and an output device 240.

The material acquisition module 210 receives the digital material 205 (such as an audio signal or a digital image). It can receive the material from almost any source, such as a storage device or a network link. In addition to acquiring the material, the material acquisition module 210 may also normalize its amplitude. In this case, it may also be called an amplitude normalizer.

The partition module 220 divides the material into a plurality of pseudo-randomly sized, pseudo-randomly positioned areas (i.e., partitions). Such areas may overlap (but such an overlap is not necessary).

For example, if the material is an image, it can be split into two-dimensional polygons (e.g., areas) with pseudo-random sizes and locations. In another example, if the material is an audio signal, a two-dimensional representation (using frequency and time) of the audio clip can be divided into two-dimensional polygons (e.g., triangles) with pseudo-random sizes and locations.

In this embodiment, these regions do indeed overlap with each other.

The area statistics calculation module 230 calculates statistics for each of the areas generated by the partition module 220. The statistics computed by calculation module 230 may be the feature vectors described above in the description of steps 130 and 160.

The output device 240 presents the results (per area or combined) of the area statistics calculation module 230 . Such results may be stored or used for further calculations.

Application examples for the exemplary material representer

The exemplary material representer may be useful for various applications. Such applications may include adversarial and non-adversarial scenarios.

Some non-adversarial applications may include searching signal databases and monitoring signals in non-adversarial environments. In non-adversarial applications, applying this approach to the entire image may provide favorable results. Another use of this algorithm is in certification applications: in order to compactly describe the distinctive features of a person (face images, images of the iris, fingerprints, etc.), an application can use their hash values, where these hash values are generated by the exemplary material representer.

Illustrative computer system and environment

Fig. 3 illustrates an example of a suitable computing environment 300 in which the exemplary material representer described above may be implemented (either in whole or in part). Computing environment 300 may be implemented in the computer and network architectures described below.

The exemplary computing environment 300 is only one example of a computing environment and is not intended to suggest any limitation as to the scope or functionality of these computer and network architectures. The computer environment 300 is also not to be interpreted as having any dependency or requirement related to any one or combination of the components illustrated in the exemplary computer environment 300.

The exemplary content presenter may be implemented in a variety of other general purpose or special purpose computer system environments or configurations. Examples of well-known computer systems, environments, and/or configurations that may be suitable for use include, but are not limited to, personal computers, server computers, thin clients, thick clients, handheld or portable devices, multiprocessor systems, microprocessor systems , set-top boxes, programmable consumer electronics, networked personal computers, minicomputers, mainframe computers, distributed computing environments, which may include any of the above systems or devices, and the like.

An exemplary content presenter can be described in the general context of processor-executable instructions, such as program modules executable by a computer. In general, program modules include procedures, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types. The exemplary content presenter may be used in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may reside in both local and remote computer storage media, including storage devices.

Computing environment 300 includes a general purpose computing device in the form of computer 302. Components of computer 302 may include, but are not limited to, one or more processors or processor devices 304, system memory 306, and a system bus 308 that connects various system components, including processor 304, to system memory 306.

System bus 308 represents any one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures may include CardBus, Personal Computer Memory Card International Association (PCMCIA), Accelerated Graphics Port (AGP), Small Computer System Interface (SCSI), Universal Serial Bus (USB), IEEE 1394, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus, also known as the Mezzanine bus.

Computer 302 typically includes a plurality of processor-readable media. Such media can be any available media that computer 302 has access to and includes both volatile and nonvolatile media, removable or non-removable media.

System memory 306 includes processor-readable media in the form of volatile memory, such as random access memory (RAM) 310, and/or non-volatile memory, such as read only memory (ROM) 312. Basic input/output system (BIOS) 314 containing basic routines that help transfer information between elements in computer 302, such as during startup, is stored in ROM 312. RAM 310 typically contains data and/or program modules that are directly accessible and/or currently being processed by processor device 304.

Computer 302 may also include other removable/non-removable, volatile/nonvolatile computer storage media. As an example, FIG. 3 illustrates a hard disk drive 316 for reading from or writing to a fixed non-volatile magnetic media (not shown), a magnetic disk drive 318 for reading from or writing to a removable non-volatile magnetic disk 320 (e.g., a "floppy disk"), and an optical disc drive 322 for reading from and/or writing to a removable non-volatile optical disc 324 such as a CD-ROM, DVD-ROM, or other optical media. The hard disk drive 316, magnetic disk drive 318, and optical disk drive 322 are each connected to the system bus 308 via one or more media interfaces 326. Alternatively, hard disk drive 316, magnetic disk drive 318, and optical disk drive 322 may be connected to system bus 308 via one or more interfaces (not shown).

These drives and associated processor-readable media provide non-volatile storage of computer-readable instructions, data structures, program modules, and other data to computer 302. Note that other types of processor-readable media that can store data and that can be accessed by a computer, such as magnetic cassettes or other magnetic storage devices, flash memory cards, CD-ROMs, digital versatile disks (DVDs), or other optical storage devices, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and the like may also be used to implement the exemplary computer system and environment.

Any number of program modules may be stored on hard disk 316, magnetic disk 320, optical disk 324, ROM 312, and/or RAM 310, including, for example, an operating system 326, one or more application programs 328, other program modules 330, and program data 332.

A user may enter commands and information into the computer 302 via input devices such as a keyboard 334 and a pointing device 336 (e.g., a "mouse"). Other input devices 338 (not specifically shown) may include a microphone, joystick, game pad, satellite dish, serial port, scanner, and/or the like. These and other input devices are connected to the processor device 304 via I/O interfaces 340 that are connected to the system bus 308, but they may also be connected via other interfaces and bus structures such as a parallel port, game port, or universal serial bus (USB).

A monitor 342 or other type of display device may also be connected to the system bus 308 via an interface such as a video adapter 344. In addition to the monitor 342, other peripheral output devices may include components such as speakers (not shown) and a printer 346 that may be connected to the computer 302 through the input/output interfaces 340.

Computer 302 may operate in a networked environment using logical connections to one or more remote computers, such as remote computing device 348. As an example, remote computing device 348 may be a personal computer, laptop, server, router, network computer, peer device, or other common network node. Remote computing device 348 is shown as a portable computer, which may include many or all of the elements and features described in connection with computer 302.

Logical connections between computer 302 and remote computer 348 are shown as a local area network (LAN) 350 and a wide area network (WAN) 352. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet. Such network environments may be wired or wireless.

When implemented in a local area network (LAN) environment, computer 302 is connected to local area network 350 via a network interface or adapter 354. When implemented in a wide area network (WAN) environment, computer 302 typically includes a modem 356 or other means for establishing communication over wide area network 352. Modem 356, which may be internal or external to computer 302, may be connected to system bus 308 via I/O interfaces 340 or other suitable mechanisms. It should also be appreciated that the network connections shown are illustrative and that other means of establishing communication(s) between computers 302 and 348 may be used.

In a networked environment, such as the computer environment 300 shown, program modules shown for the computer 302, or a portion thereof, may be stored in a remote storage device. As an example, remote application programs 358 reside on the storage device of the remote computer 348. For purposes of illustration, application programs and other executable software components, such as an operating system, are shown here as discrete units, although such programs and components are understood to reside at different times on various storage components of the computing device 302 and to be executed by the computer's data processor(s).

Processor-executable instructions

An implementation of an exemplary content presenter may be described in the general context of processor-executable instructions, such as program modules, executable by one or more computers or other devices. In general, program modules include procedures, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types. Typically, the functionality of the software modules can be combined or distributed as needed in various embodiments.

Illustrative operating environment

FIG. 3 illustrates an example of a suitable operating environment 300 in which an exemplary content renderer may be implemented. More specifically, the exemplary content renderer(s) described above may be implemented (in whole or in part) by any of the software modules 328-330 and/or operating system 326 depicted in FIG. 3, or portions thereof.

This operating environment is only an example of a suitable operating environment and is not intended to impose any restrictions on either the scope or use of the functionality of the exemplary content presenter described above. Other well-known computer systems, environments, and/or configurations that are suitable for use include, but are not limited to, personal computers (PCs), server computers, handheld or portable devices, multiprocessor systems, microprocessor systems, programmable consumer electronics, wireless telephones and equipment, general purpose and special purpose equipment, application specific integrated circuits (ASICs), networked PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

Processor-readable media

An implementation of the exemplary content presenter may be stored on or transmitted over some form of processor-readable media. Processor-readable media can be any available media that can be accessed by a computer. By way of example, processor-readable media may include, but are not limited to, "computer storage media" and "communication media."

"Computer storage media" includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions (commands), data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical media, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other media that can be used to store the necessary information and that can be accessed by a computer.

A "communication medium" typically embodies processor-readable instructions, data structures, program modules, or other data as modulated data signals, such as a carrier signal or other transport mechanism. Communication media also includes any information delivery media.

The term "modulated data signal" means a signal that has one or more parameters set to a certain state or changed in such a way as to encode information in the signal. By way of example, communication media may include, but is not limited to, wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF (radio frequency), infrared, and other wireless media. Combinations of any of the above also qualify as processor-readable media.

Conclusion

Although the present invention is described in language specific to structural features and/or methodological steps, it should be understood that the present invention as defined in the appended claims is not necessarily limited to those specific features or steps that are described. Rather, these specific features and steps are disclosed as preferred forms of this claimed invention.

CLAIMS

1. A processor-readable medium having processor-executable instructions that, when executed by a processor, perform a method for identifying digital materials based on their compact description, said method comprising the steps of:

receiving digital material,

segmenting said material into a plurality of regions,

generating feature vectors for each region of said plurality, wherein the feature vectors are calculated based on matrix invariances, including singular value decomposition,

generating an output using a combination of the computed feature vectors, wherein the output forms a hash value vector for said digital material, and the hash value vector is a compact representation of the digital material, and thereby

identifying the digital material based on said compact representation.

2. The medium of claim 1, wherein at least some of the regions of said plurality overlap.

3. The medium of claim 1, wherein said segmenting step comprises pseudo-randomly segmenting said material.

4. The medium of claim 1, wherein said digital materials are selected from the group consisting of a digital image, a digital audio clip, a digital video, a database, and a program image.

5. A computer comprising one or more processor-readable media according to claim 1.

6. A system for generating a compact description of digital materials, containing:

an acquisition module configured to acquire digital material,

a segmentation module configured to divide said material into a plurality of areas,

a calculation module configured to generate feature vectors for each region from said set, wherein the feature vectors are calculated based on the invariances of the matrices, including the singular value decomposition,

an output module configured to generate an output using a combination of the computed feature vectors, wherein the output generates a hash value vector for that digital material, where the hash value vector is a compact representation of the digital material, thereby identifying the digital material based on said compact representation.

7. The system of claim 6, wherein at least some of said plurality of regions overlap.

8. The system of claim 6, wherein said segmentation module is further configured to pseudo-randomly segment said material.

9. The system of claim 6, wherein said digital materials are selected from the group consisting of digital image, digital audio clip, digital video, database, and program image.
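
As a rough illustration of the steps recited in the claims above, the following Python sketch segments an image into pseudo-randomly placed, possibly overlapping square regions, computes one SVD-based feature per region, and concatenates the features into a hash vector. It is a minimal sketch under stated assumptions, not the patented implementation: the function names, the region count and size, the key, the use of only the largest singular value as the feature, and the power-iteration approximation are all illustrative choices that do not come from the source.

```python
import math
import random


def top_singular_value(block, iterations=50):
    """Approximate the largest singular value of a matrix (list of rows).

    Power iteration on A^T A is an assumption of this sketch. It presumes
    non-negative entries (e.g. pixel intensities), so the uniform start
    vector is not orthogonal to the dominant singular direction.
    """
    cols = len(block[0])
    v = [1.0 / math.sqrt(cols)] * cols  # unit-length start vector
    sigma = 0.0
    for _ in range(iterations):
        # u = A v; its length approaches the largest singular value
        u = [sum(a * x for a, x in zip(row, v)) for row in block]
        sigma = math.sqrt(sum(x * x for x in u))
        # w = A^T u, then renormalize to get the next v
        w = [sum(row[j] * u[i] for i, row in enumerate(block)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    return sigma


def pseudo_random_regions(height, width, count, size, key):
    """Pick top-left corners of square regions with a keyed generator.

    Regions may overlap; the same key always yields the same regions.
    """
    rng = random.Random(key)
    return [
        (rng.randrange(height - size + 1), rng.randrange(width - size + 1))
        for _ in range(count)
    ]


def hash_vector(image, count=8, size=4, key=1234):
    """Combine one feature per region into a hash vector (hypothetical scheme)."""
    height, width = len(image), len(image[0])
    features = []
    for top, left in pseudo_random_regions(height, width, count, size, key):
        block = [row[left:left + size] for row in image[top:top + size]]
        features.append(top_singular_value(block))
    return features


if __name__ == "__main__":
    # Synthetic 16x16 "image" used only for demonstration.
    image = [[float((7 * i + 13 * j) % 256) for j in range(16)] for i in range(16)]
    print(hash_vector(image))
```

Two materials could then be compared by the distance between their hash vectors. Because a singular value changes by no more than the spectral norm of the perturbation, a small change to the input moves each feature only slightly, which is the kind of robustness a compact description is meant to provide.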

a) Numerical data should be arranged according to the rules for reading statistical tables: rows are read from left to right and columns from top to bottom. Numbers should be centered in the column and aligned one under another: units under units, decimal separator under decimal separator.

b) The location of the digital material must be logically justified. For example, groups according to the trait under study should be presented in ascending or descending order of the trait values.

c) Rounding numbers is recommended. Numerical data in the same row or column must be rounded to the same degree of accuracy: to an integer, to tenths, to hundredths, etc. If all numbers in a row or column have one decimal place and one number has two or more, the numbers with one decimal place must be padded with a zero.

d) Numerical data should be presented as concisely as possible. Numbers of 7-8 or more digits should be reduced to 2-3 digits by changing the unit of measurement. For example, "rubles" can be converted to "million rubles."

e) If, in the interests of research, you still had to resort to multi-digit numbers, it is recommended to separate different classes of numbers from each other, highlighting millions, thousands, units, etc. with a space (blank).

f) If one value is many times greater than another, the comparison should be expressed as a ratio (how many times greater).
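
Several of the rules above are mechanical enough to sketch in code. The following Python example is an illustration only: the helper names `format_column`, `to_millions`, and `times_greater` are hypothetical and do not come from the source. It rounds a column to a uniform precision, separates digit classes with a space, right-aligns the cells so that units and decimal points line up, converts a value to millions, and expresses a comparison as a ratio.

```python
def format_column(values, decimals=1):
    """Format the numbers of one table column.

    Every value is rounded to the same number of decimal places (rule c),
    digit classes are separated by a space (rule e), and cells are
    right-aligned so units and decimal points line up (rule a).
    """
    cells = [f"{value:,.{decimals}f}".replace(",", " ") for value in values]
    width = max(len(cell) for cell in cells)
    return [cell.rjust(width) for cell in cells]


def to_millions(value):
    """Convert a value to millions of the same unit (rule d)."""
    return value / 1_000_000


def times_greater(larger, smaller):
    """Express a comparison as a ratio (rule f)."""
    return larger / smaller


if __name__ == "__main__":
    for cell in format_column([12.5, 3.14, 1234567.0]):
        print(cell)
```

Right-aligning cells that share the same number of decimal places is a simple way to get the decimal points into one vertical line without a separate alignment step.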

Notes and additions. If the table, along with reporting materials, contains calculated information, and also if the table is compiled on the basis of data obtained using various methodologies, then such a table should be supplemented with appropriate explanations. Such additions may be placed before the table, in its title, or directly in the table itself. Also, the table can be provided with a note or footnotes, located, as a rule, below the table. If any of the table data are borrowed, then their source should be indicated.

Conventions. Data may be missing from a table for different reasons, so statistical practice has adopted a number of symbols:

"x" - the cell is not to be filled: for example, the cell at the intersection of the row "5-9 years old" and the column "number of marriages" cannot be filled;

"…" / "No information" / "N/A" - the information is missing for some reason;

"–" - the phenomenon is absent;

"0.0" / "0.00" - the numerical value is less than the accuracy accepted in the table.

The final stage of working with a statistical table is reading it and then analyzing it. Table analysis involves splitting the table into parts and is subdivided into structural analysis (analysis of the structure of the table) and meaningful analysis (analysis of its contents). The table can be studied row by row, by horizontal analysis, or column by column, by vertical analysis. The result of analytical work with the table should be conclusions about the studied population as a whole.

Bulletin of the Higher Attestation Committee of the Russian Federation. 1995. - No. 1 (January). - S. 5-6.

4.2. Presentation of tabular material

Digital material, when there is a lot of it or when there is a need to compare and derive certain patterns, is drawn up in the dissertation in the form of tables.

A table is a way of presenting information in which numerical or textual material is grouped into rows and columns separated from one another by horizontal and vertical rules.

By content, tables are divided into analytical and non-analytical. Analytical tables are the result of processing and analyzing numerical indicators. As a rule, such tables are followed by a generalization that presents new (derived) knowledge and is introduced into the text with phrases such as "the table allows us to conclude that ...", "it is clear from the table that ...", "the table makes it possible to conclude that ...", and so on. Such tables often make it possible to identify and formulate certain patterns.

In non-analytical tables, as a rule, raw statistical data are placed, which are necessary only for information or ascertainment.

Typically, a table consists of the following elements: a serial number and a thematic heading, a sidebar, headings of the vertical columns (the head), and the horizontal rows and vertical columns of the main part (the prograph).

The logic of a table should be such that its logical subject (the designation of the objects it characterizes) is located in the sidebar, in the head, or in both, but not in the prograph, while its logical predicate (i.e., the data that characterize the subject) is located in the prograph, but not in the head or sidebar. Each heading above a column should refer to all data in that column, and each row heading in the sidebar to all data in that row.

The heading of each column in the head of the table should be as short as possible. It is necessary to eliminate repetitions of the thematic heading in the headings of the columns; eliminate the tier indicating the unit of measurement, transferring it to the thematic heading; put repeated words in unifying headings.

The sidebar, like the head, should be concise. Repeating words should be placed in unifying headings; words common to all headings of the sidebar are placed in the heading above the sidebar. Do not put punctuation marks after the headings of the sidebar.

In the prograph, all repeating elements that relate to the entire table are moved to the thematic heading or the column heading; homogeneous numerical data are arranged so that their digit classes line up; heterogeneous data are each placed on a separate indented line; ditto marks are used only in place of identical words that stand one under another.

The main headings in the table itself are capitalized. Subheadings are written in two ways: with a lowercase letter if they are grammatically related to the main heading, and with a capital letter if there is no such connection. Headings (both subordinate and main) should be as precise and simple as possible. They should not contain repeated words or dimensions.

The vertical column "No." (sequence number) should be avoided; in most cases it is unnecessary. The vertical column "Note" must be handled very carefully. Such a column is justified only when it contains data relating to most of the rows of the table.

All tables, if there are several of them, are numbered with Arabic numerals within the entire text. Above the upper right corner of the table, the inscription "Table ..." is placed indicating the serial number of the table (for example, "Table 4") without the No. sign in front of the number and a dot after it. If there is only one table in the text of the dissertation, then the number is not assigned to it and the word "table" is not written. Tables are provided with thematic headings, which are located in the middle of the page and are written in capital letters without a dot at the end.

When transferring the table to the next page, the head of the table should be repeated and the words "Continuation of table 5" should be placed above it. If the head is bulky, it is allowed not to repeat it. In this case, the columns are numbered and their numbering is repeated on the next page. The heading of the table is not repeated.

All data given in the tables must be reliable, homogeneous and comparable, their grouping must be based on essential features.

Tables whose data have already been published in print may not be placed in the text of the dissertation without a reference to the source.

Quite often graduate students - authors of Ph.D. dissertations - present digital material in tables, when it is more convenient to place it in the text. Such tables make an unfavorable impression and testify to the inability to handle tabular material. Therefore, before placing any material in the form of a table, one should decide whether it is possible to present it in plain text form.

Digital content is a collection of entertainment materials distributed electronically through special channels for use on digital devices: computers, tablets, and smartphones. The main types of modern digital content are text, games, video, and audio materials.

To understand what digital content is, just go to any Internet resource or turn on the TV. Everything you see there (programs, series, musical compositions, images) is digital content. The life of a modern person is inextricably linked with it, and every day we receive a huge stream of digital content.

The concept of digital content

Today, this term is used to describe various areas of the modern market for multimedia goods and products:

  • This is content that is presented in digital or electronic form.
  • This is an activity aimed at distributing content, that is, any multimedia products in the digital environment.
  • Actions aimed at the consumption and further use of content created in electronic form.

In addition to the concepts described above, other definitions are used:

  • Communication operators, for example, Internet providers or mobile operators, understand digital content as a kind of data that places special requirements on transmission quality.
  • Producers of multimedia products use the term "digital content" to refer to a collection of materials that cannot be produced without digital technologies or presented without a digital format.

Use of digital content

Use is directly related to delivery and consumption. Materials are delivered via the Internet, on physical media, or via digital television. The modern Internet provides high transmission speeds and high network bandwidth. Today, most traffic consists of "heavy" multimedia products. In 2016, more than 15% of the world's Internet traffic came from watching online video. This includes viewing on PCs, smartphones, tablets, and modern TVs. Consumption takes place through devices for accessing digital content, which we will discuss below.

You can use digital content for a variety of purposes: business (promotion of goods and services), education, entertainment and leisure, communication, etc. If you want to grow your business successfully and use effective advertising tools, simple messages and offers are not enough. The modern user is saturated with content of every kind and wants something fresh and creative.

According to the latest research, videos are the most popular digital content, which means they bring the most income to their creators. The video segment includes digital television, a range of VOD (video on demand) services, and online video. Of all revenue in the electronic content market, 72% comes from the video segment, 14% from mobile content, 10% from online games, 3% from audio materials, and 1% from e-books.

Most digital content is produced and used in the USA. Next in the ranking are European countries, the states of Asia, and the Russian Federation. In the countries of Southeast Asia, its popularity is due to high-quality Internet and developed infrastructure. In Western Europe, consumption has grown steadily, but over the past 5 years sales of video and audio content on physical media have been declining, as the audience prefers to buy products digitally. In our country today, digital content is developing mainly in the direction of mobile content.

Access devices

Infrastructure is needed to create, distribute and consume digital content. The development and availability of terminals for receiving content contributes to the increase in the consumption of multimedia products. These are the digital devices we use on a daily basis. Every day new technologies appear, the range of digital devices is expanding, their cost is becoming more affordable for consumers. Today it is difficult to find a person who has never heard of a smartphone or tablet. Even in remote rural areas, almost everyone has a smartphone, TV, computer.

Until 2012, mobile devices were not evaluated as a content consumption channel, since media was transmitted via the Internet, physical media, television, but not cellular networks. Today the market relies on the mobile segment, its audience joins the flow of Internet content consumption.

Multi-platforms are also being created to access digital content, such as SmartTV. With it, you can access the Internet and simultaneously watch video through analog or digital television. Game consoles are gaining great popularity today, through which you can access the Internet and play from physical media or online.

Creation of digital content

It is a complex process from the idea of a product to its implementation and delivery to the user. Anyone can create digital content of mediocre quality; many programs and applications exist for this today. These include various video editors (Windows Movie Maker, SONY Vegas Pro, Pinnacle Video, Editor JahShaka and others) and services for developing e-books and animated stories (StoryBird, UtellStory, ACMI Storyboard Generator, etc.).

However, it is better to entrust the creation of high-quality content, especially when it comes to promotional materials, to professionals. Good specialists have enough experience and knowledge to create materials worthy of the attention of the audience. They also have at their disposal the necessary equipment with high performance and packages of applied highly specialized professional programs, which are usually not found in the public domain.

The term "digital printing" combines technologies that allow you to reproduce an image and text from an electronic file, bypassing the plate processes. There are a large number of different devices for digital printing, from an ordinary desktop printer to industrial sheet and web printing machines and large-format plotters, but they all have one thing in common - no need to output plates and the ability to transfer variable data to the printed material.

Digital printing technology emerged in the late 1970s with the creation of the first laser printer. Digital printing machines differ from printers in the format of the printed material and in printing speed: industrial printing machines include devices capable of outputting 70 pages per minute (ppm) or more.

Digital printing technology

Prepress in the digital method is limited to working with colors, marking, and positioning on the printed sheet. The image is exposed directly in the device itself. The two most common types of devices are machines based on the electrostatic (electrophotographic) principle and inkjet machines.

Electrophotography is an image transmission process that involves a photoreceptor drum. A uniform electric charge is applied to its surface. Then the laser weakens the charge in places corresponding to the future image (exposure), the rollers supply toner (special coloring powder), which is attracted to the latent electrostatic image. Highly electrified white areas repel toner. After that, the image from the photoreceptor drum passes to the paper and is fixed under the influence of heat.

Inkjet technology is based on transferring ink droplets to the image areas through thin nozzles. The droplets are controlled by charged electrodes, which deflect them to change their trajectory or to send them into a trap.


Based on the foregoing, the following advantages of digital printing can be distinguished:

  • Speed (printing can start immediately, without spending time on plate processes);
  • No prepress costs (plate output);
  • Reproduction of variable data (a multi-page document, such as a brochure, can be printed as a separate short run);
  • Cost per copy that does not depend on the run length (so small runs are economical on digital printing machines).

The disadvantages of this method are as follows:

  • Restriction on the use of Pantone inks for printing;
  • Problems with uniformity of ink on large solid areas;
  • A not very reliable bond between ink and paper: on folds, for example, when printing leaflets, the toner on solid areas will crack;
  • Lower, in comparison with offset, color rendering quality;
  • The high cost of consumables (therefore, printing in medium and large runs is done in offset).
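
The relationship between run length and cost per copy noted in the lists above can be sketched numerically. The example assumes a simple hypothetical cost model that is not taken from the source: a digital copy costs a fixed amount regardless of run length, while offset adds a one-time plate (setup) cost spread over the run plus a lower per-copy cost. All function names and prices are made-up illustration values.

```python
import math


def offset_cost_per_copy(run, setup_cost, unit_cost):
    """Cost of one offset copy: setup cost spread over the run plus unit cost."""
    return setup_cost / run + unit_cost


def break_even_run(setup_cost, digital_unit_cost, offset_unit_cost):
    """Smallest run at which offset is no more expensive per copy than digital.

    Assumes the digital cost per copy is constant (independent of the run).
    """
    if digital_unit_cost <= offset_unit_cost:
        raise ValueError("offset never catches up if its unit cost is not lower")
    return math.ceil(setup_cost / (digital_unit_cost - offset_unit_cost))


if __name__ == "__main__":
    # Hypothetical prices: plates cost 200.00, a digital copy 0.75, an offset copy 0.25.
    run = break_even_run(200.0, 0.75, 0.25)
    print(run, offset_cost_per_copy(run, 200.0, 0.25))
```

Under this model, runs below the break-even point favor digital printing and runs above it favor offset, which matches the qualitative statement that small runs go to digital machines while large runs are printed in offset.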

Application of digital printing

Digital printing is used to reproduce small and medium runs of virtually any kind of printed matter, from ordinary business cards and leaflets to brochures, multi-page catalogs, and books. The method is not limited to printed advertising, books, and other printed products: the scope of digital machines is much wider and also includes interior design, outdoor advertising, photographs, reproductions of works of art, the textile industry, and so on.

Paper for digital printing

For digital printing, special coated and uncoated papers and cardboards, self-adhesive materials (both paper-based and polymer-based) are produced, as well as designer papers, including those with various coatings, textures and other effects. Paper for digital printing should have a high degree of smoothness and clean cut edges.

In addition to paper, digital technologies make it possible to print on fabric, canvas, and film.

A computer