ISSN 0236-235X (P)
ISSN 2311-2735 (E)

Journal influence

Higher Attestation Commission (VAK) - K1 quartile
Russian Science Citation Index (RSCI)

Next issue

4
Publication date:
09 December 2024

Journal articles, No. 2, 2023

1. A comparative analysis of methods for constructing mathematical models of object functioning using machine learning [No. 2, 2023]
Authors: Kovalnogov, V.N. (kvn@ulstu.ru) - Ulyanovsk State Technical University (Head of Department "Thermal and Fuel Energy"), Ph.D; Sherkunov, V.V. (v.sherkunov@ulstu.ru) - Ulyanovsk State Technical University (Postgraduate Student); Hussein Mohamed Hussein (mohammedab634@gmail.com) - Ulyanovsk State Technical University (Postgraduate Student); Klyachkin, V.N. (v_kl@mail.ru) - Ulyanovsk State Technical University (Professor), Ph.D;
Abstract: The subject of the study is a technical object whose operation is determined by many factors and whose performance is characterized by a single indicator. The task is to build a mathematical model that relates this indicator to the values of the factors. As an example, the article examines the influence of various factors on the efficiency of burner devices (load, air consumption, methane and biogas, fuel and oxidizer compositions, and others). The efficiency (performance) of the burner device is assessed by the temperature of the flue gases. The problem is solved by machine learning methods, since classical regression analysis methods showed insufficient accuracy. The article explores the effectiveness of the following approaches: support vector machines, random forest and decision tree boosting. The authors used a localized version 13.3 of the Statistica system for numerical calculations. All three machine learning approaches discussed in the paper showed a significant increase in model accuracy on the test sample. Decision tree boosting showed the best results in this example. The recommended technology for constructing a model with the necessary forecasting accuracy starts with classical regression analysis: if the resulting model provides the necessary accuracy, it is preferable because it is easier to interpret. If the accuracy is insufficient, the three machine learning methods considered are used. In this case, it is important to select the parameters of each method so that they provide the necessary accuracy without leading to overfitting. The resulting model can be used to assess the influence of various factors on the efficiency of the technical object, as well as to predict the quality of its functioning (in the considered example, to predict the temperature of the flue gases).
Keywords: decision tree boosting, random forest, support vector machines, multicollinearity, regression model
Visitors: 3085

2. On-the-fly data clustering for the PostgreSQL database management system [No. 2, 2023]
Authors: Tatarnikova, T.M. (tm-tatarn@yandex.ru) - St. Petersburg State University of Aerospace Instrumentation (Associate Professor, Professor), Ph.D;
Abstract: The paper establishes the relevance of real-time data clustering implemented as a dynamically loaded library for the PostgreSQL open-source database management system. It formulates the conditions for real-time clustering: performance must be sufficient for the time spent determining clusters not to exceed the time spent writing data to the table, and the amount of data to be clustered must be limited. PostgreSQL methods are available in the devel library, which makes it possible to work with data at the level of the internal representation and to use other programming languages that perform some operations faster than the SQL query language. The scheme of interaction between the elements involved in clustering includes: a database with a dynamically loaded library and the TimescaleDB extension, which organizes data storage on the database server; an interpreter, a software layer that translates data from the internal representation into the types of the language used before clustering and translates the clustering results back into the internal format for saving them to the database; and a clusterizer, a program that clusters the transmitted data according to an algorithm. The proposed library implements a trigger function, which in effect is the interpreter that connects the clusterizer with the database. If the function runs on a table for the first time, the initial centroids are selected in the way the user specified. Otherwise, the centroid data is read from the table. The paper demonstrates the library at work. The data set for clustering is randomly generated with a concentration around the given centroid coordinates. The library limits neither the dimension of the points to be distributed among clusters nor the number of tables into which data is inserted. Due to the computational complexity of the algorithms, there is a limit on the maximum amount of data for clustering.
Keywords: PostgreSQL, centroid method, dynamic link library, DBMS, clustering
Visitors: 3649
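
The per-row step that such a trigger function might perform can be sketched as follows. This is a minimal illustration under stated assumptions, not the library described in the abstract: it uses sequential (online) k-means as the centroid method, the name `assign_and_update` is hypothetical, and plain Python lists stand in for the table that stores centroids. No PostgreSQL calls are shown.

```python
from math import dist


def assign_and_update(centroids, counts, point):
    """Assign a point to its nearest centroid and move that centroid toward it.

    Assumption: sequential k-means, where each centroid is the running mean
    of the points assigned to it so far.
    """
    idx = min(range(len(centroids)), key=lambda i: dist(centroids[i], point))
    counts[idx] += 1
    rate = 1.0 / counts[idx]  # running-mean step size
    centroids[idx] = tuple(
        c + rate * (p - c) for c, p in zip(centroids[idx], point)
    )
    return idx


# Initial centroids as a user might specify them on the first call;
# each starts with a count of 1 so that it acts as one observed point.
centroids = [(0.0, 0.0), (10.0, 10.0)]
counts = [1, 1]

# Each inserted row is clustered as it arrives.
rows = [(1.0, 1.0), (9.0, 11.0), (0.0, 2.0)]
labels = [assign_and_update(centroids, counts, row) for row in rows]
```

Because each update touches only one centroid, the cost per inserted row is proportional to the number of clusters, which is the kind of bound the real-time condition in the abstract calls for.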

3. Developing a program self-assembly mechanism based on sockets [No. 2, 2023]
Authors: Kol’chugina, E.A. (kea@pnzgu.ru) - Penza State University (Professor of the Department of Mathematical Support and Computer Application), Ph.D;
Abstract: The paper focuses on methods and algorithms of spontaneous self-assembly and self-organization of software systems. Among the artificial chemistry models, there are some methods that allow programs to form themselves. However, these methods are very specific and difficult to integrate with conventional, widespread imperative programming tools. Thus, other types of tools are needed that make it possible to dynamically establish relations between programs or executing processes. The method previously proposed by the author is based on using Internet sockets to connect program units of different types. Some of these units are servers, some are clients, and some are of a hybrid client-server type. The units are generally considered as artificial atoms that react with each other and form complex substances (i.e. programs of different structures). This paper proposes the algorithms of such program units. Once implemented, these algorithms allow creating collectives of independent interacting program units capable of forming different computing configurations. The designed algorithms are the basis for implementing a concept that allows software to form spontaneously in accordance with specified rules under specified conditions. The experiments resulted in computational structures similar to real-world polymers and capable of pumping data through themselves. The obtained results are necessary for organizing a fully automated software development process based on the simulation of spontaneity. The program development process will require less human involvement and will therefore become more efficient and economically profitable.
Keywords: sockets, chemical reactions and charged particles simulation, self-organization and self-assembly of programs
Visitors: 2634

4. Neural network tool environment for creating adaptive application program interfaces [No. 2, 2023]
Authors: Tagirova, L.F. (LG-77@mail.ru) - Orenburg State University (Associate Professor), Ph.D; Zubkova, T.M. (bars87@mail.ru) - Orenburg State University, Ph.D;
Abstract: Software is used in almost all areas of human activity. Erroneous actions of the user, which often depend on the user's emotional state, can lead to negative consequences, especially in production management, technological processes, design activities, medicine, etc. The article is devoted to the problem of personalizing the interface of application programs to the user's individual features based on neural network technologies. The novelty of the proposed approach is that the prototype interface is formed by selecting each menu item separately, which allows forming a personalized interface. The authors propose using a tool environment that includes a set of interface components for a dynamically generated unique interface prototype adapted to the features of each user. As a tool for selecting interface components, the authors used a deep neural network in the form of a multilayer perceptron. The input parameters of the neural network are the distinctive features of users; the outputs are the components of the future prototype interface. Professional and psychophysiological characteristics of users, their demographic characteristics, and their emotional state were chosen as criteria for adapting the interface part of applications. The output parameters are interface components: text font size and hyperlinks, size and distance between web page elements, tooltip view and context menu, messages to the user, color scheme, availability of a window for information search, etc. As a result, the paper presents a developed tool environment for creating personalized application program interfaces using neural network technologies. While the software runs, users are evaluated by their characteristics using basic tests from IT and psychology. To determine the emotional tone, age and gender, the system uses the Python Deepface library, which implements an algorithm based on a trained convolutional neural network. The implementation of the proposed tool environment will ensure comfortable interaction between users and the application.
Keywords: electronic training system, personalized interface, multilayer perceptron, neural network, artificial intelligence, interface components
Visitors: 3878

5. Features of working with Russian-language ontologies using the Owlready2 library in Python [No. 2, 2023]
Authors: Shchukarev, I.A. (blacxpress@gmail.com) - Ulyanovsk State Technical University, Institute of Aviation Technology and Management (Associate Professor), Ph.D;
Abstract: The use of domain ontologies to create information systems is becoming more and more widespread. Based on ontologies, it is possible to create so-called knowledge bases, which are essential components of most information systems. There are various software products for working with ontologies, such as Protégé or the Owlready2 module for the Python programming language. As a rule, to search for new knowledge or facts contained in the ontology, a logical inference engine, or reasoner, is used, which also checks the ontology for consistency, i.e. the absence of contradictions. When the Owlready2 library is used with Russian-language ontologies, i.e. ontologies in which all classes, individuals and relationships are originally written in Cyrillic, the reasoner returns incorrect, unreadable data. Because of an encoding failure while the Owlready2 reasoner runs, the ontology is duplicated and unreadable characters appear in place of the Cyrillic text. With such data, further work in Python or Protégé is not possible without additional steps. The article proposes a way to solve this problem by explicitly setting the encoding of the output file after the reasoner has run, namely cp1251, the standard 8-bit encoding for Russian versions of Microsoft Windows. As a result, when working with Russian-language ontologies, it becomes possible to use the full potential of the Owlready2 library. The creation of Russian-language ontologies that can serve as the basis for Russian-language information systems is therefore an urgent task at present. The method proposed in the article can be useful for IT specialists who develop information systems based on domain ontologies and for working with ontologies as part of the educational process at a university.
Keywords: Russian-language ontologies, Owlready2, Protégé, Python, information system, ontology
Visitors: 3904
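
The effect of naming the encoding explicitly can be illustrated with the Python standard library alone. The sketch below is an assumption-laden illustration and does not use the Owlready2 API: the file name `inferred.owl`, the sample text and the helper `reencode` are hypothetical, and the file is written by the script itself to imitate a tool that emits cp1251 bytes.

```python
import os
import tempfile


def reencode(path, src="cp1251", dst="utf-8"):
    """Read a file with an explicitly named source encoding, rewrite it in dst."""
    with open(path, "r", encoding=src) as f:
        text = f.read()
    with open(path, "w", encoding=dst) as f:
        f.write(text)
    return text


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "inferred.owl")  # hypothetical output file

    # Imitate a tool that writes Cyrillic text as cp1251 bytes.
    with open(path, "wb") as f:
        f.write("Класс: Самолёт".encode("cp1251"))

    # Decoding the same bytes with the wrong codec yields unreadable characters.
    with open(path, "rb") as f:
        wrong = f.read().decode("latin-1")

    # Naming the encoding explicitly restores the text.
    restored = reencode(path)
    with open(path, "r", encoding="utf-8") as f:
        roundtrip = f.read()
```

The same bytes are either garbage or valid Cyrillic depending only on the codec named when the file is opened, which is why fixing the encoding explicitly removes the problem.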

6. T5 language models for text simplification [No. 2, 2023]
Authors: Vasiliev, D.D. (dmitriy.vasiliev.0303@gmail.com) - Siberian Federal University (Graduate Student), Undergraduate; Pyataeva, A.V. (anna4u@list.ru) - Siberian Federal University (Associate Professor), Ph.D;
Abstract: The problem of text readability in natural Russian is relevant for people with various cognitive impairments and for people with poor language skills, such as labor migrants or children. Texts such as instructions, directions and recommendations constantly surround us in real life. These texts can be made more accessible to these categories of citizens by using an automated text simplification algorithm. This article uses deep neural networks of the transformer architecture as the automated simplification algorithm. The following language models were applied: ruT5-base-absum, ruT5-base-paraphraser, ruT5_base_sum_gazeta, ruT5-base. Experimental studies used two data sets: a data set from the Institute of Philology and Language Communication and data from an open GitHub repository. The following set of metrics was used to evaluate the models: BLEU, Flesch Readability Index, Automatic Readability Index, and Sentence Length Difference. Then, using a test data set, statistical indicators were extracted from the listed metrics, which became the basis for comparing algorithms with different training parameters. The authors carried out several experiments with these models using different values of the learning rate for each data set, different batch sizes, and the exclusion of an additional data set from training. Despite the differences in the metrics, the models' outputs did not differ much from each other in manual comparison. The results of the experimental studies show the need to enlarge the training data set, to change the training parameters, or to use other algorithms. This study is the first step towards creating a decision support system for automatic text simplification and requires further development.
Keywords: T5 model, deep learning, text simplification, natural language processing
Visitors: 3817
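
Two of the metrics named in the abstract are simple enough to sketch directly. The Automatic Readability Index is computed here with its standard published coefficients, which were calibrated for English; whether the article adapts them for Russian is not stated. The definition of Sentence Length Difference as a difference in word counts is an assumption, and the function names and sample sentences are hypothetical.

```python
import re


def text_stats(text):
    """Return (characters in words, number of words, number of sentences)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    chars = sum(len(w) for w in words)
    return chars, len(words), len(sentences)


def automated_readability_index(text):
    """ARI = 4.71 * chars/words + 0.5 * words/sentences - 21.43."""
    chars, words, sentences = text_stats(text)
    return 4.71 * chars / words + 0.5 * words / sentences - 21.43


def sentence_length_difference(source, simplified):
    """Assumed definition: words in the source minus words in the simplification."""
    return text_stats(source)[1] - text_stats(simplified)[1]


source = ("The committee subsequently determined that the application "
          "required additional documentation.")
simplified = "The committee asked for more documents."

ari_source = automated_readability_index(source)
ari_simplified = automated_readability_index(simplified)
length_diff = sentence_length_difference(source, simplified)
```

A lower index for the simplified sentence and a positive length difference indicate that the simplification shortened both the words and the sentence.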

7. Using three-dimensional data cubes in the implementation of a business intelligence system [No. 2, 2023]
Authors: Chernysh, B.A. (borisblack@mail.ru) - Reshetnev Siberian State University of Science and Technology (Postgraduate Student); Murygin, A.V. (avm54@mail.ru) - Reshetnev Siberian State University of Science and Technology, Department of Information and Control Systems (Professor, Head of Chair), Ph.D;
Abstract: Business analysis is one of the key management tools that allows getting a reliable picture of the current business situation in an enterprise in all areas of its activity. To support this process, any company uses various data as its performance indicators. The data source is primarily integrated information systems (IIS) of various types (ERP – Enterprise Resource Planning, CRM – Customer Relationship Management, MES – Manufacturing Execution System, etc.). These systems either incorporate business analysis tools (BI – Business Intelligence) or use specialized solutions that allow performing complex analytical tasks according to a given formulation. This article discusses the features of both approaches, their advantages and disadvantages, and provides examples of foreign and domestic business analysis products on the market. Using the BI module of the SciCMS IIS, which they developed, as an example, the authors propose a method for constructing three-dimensional cubes from the data contained in the system. The article describes the methods and algorithms used, the initial requirements and the limitations. The authors formalize the tasks and consider the mathematical apparatus for constructing multidimensional data models based on information from a fixed set of normalized tables of a relational database. Examples of SQL queries and output data are given. In some cases (working with a non-relational DBMS, the need for precalculated aggregate values, the complexity and high cost of direct SQL queries, etc.), the described method for building multidimensional cubes may not be applicable. SciCMS addresses this problem with its own data import and transformation module based on an open source library. The article summarizes the main advantages and disadvantages of the proposed approach and the prospects for its use in domestic enterprises.
Keywords: SciCMS, database, BI, OLTP, OLAP, business analysis, analytical processing, three-dimensional cube, relational schema, normalization, normal form, integrated information system, GraphQL, multidimensional representation
Visitors: 3908
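
The idea of building a three-dimensional cube from relational data with a direct SQL query can be sketched with the standard `sqlite3` module. This is an illustration under stated assumptions, not the SciCMS implementation: the `sales` table, its columns and the helpers `slice_cube` and `roll_up` are hypothetical, and a single table stands in for the set of normalized tables the abstract mentions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, amount REAL);
INSERT INTO sales VALUES
  ('North', 'Pump', 2022, 100), ('North', 'Pump', 2023, 150),
  ('North', 'Valve', 2023, 40), ('South', 'Pump', 2023, 70),
  ('South', 'Pump', 2023, 30);
""")

# Each cell of the cube is addressed by three dimension values
# and holds one aggregated measure.
cube = {
    (region, product, year): total
    for region, product, year, total in conn.execute(
        "SELECT region, product, year, SUM(amount) FROM sales "
        "GROUP BY region, product, year"
    )
}


def slice_cube(cube, axis, value):
    """Keep only the cells whose coordinate on the given axis equals value."""
    return {key: v for key, v in cube.items() if key[axis] == value}


def roll_up(cube, axis):
    """Sum the measure over one axis, producing a two-dimensional table."""
    result = {}
    for key, value in cube.items():
        reduced = key[:axis] + key[axis + 1:]
        result[reduced] = result.get(reduced, 0.0) + value
    return result


north = slice_cube(cube, 0, "North")   # fix the region dimension
by_region_product = roll_up(cube, 2)   # aggregate over the year dimension
```

Every cell here is recomputed from the source rows by one `GROUP BY` query, which is the case the abstract flags as potentially too costly when precalculated aggregates are required.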

8. Optimal control of non-linear systems via quadratic criteria with bounded controls [No. 2, 2023]
Authors: Emelyanova, I.I. (emelyanova-123@yandex.ru) - Tver State Technical University; Pchelintsev, A.N. (pchelintsev.an@yandex.ru) - Tambov State Technical University (Associate Professor), Ph.D;
Abstract: The paper suggests a method of developing an optimal control for one class of nonlinear systems via a quadratic criterion with inequality-type constraints on the controls. This method is a further development of the method of successive approximations suggested in earlier works of the group of authors to which the authors of the current paper belong. By modifying that method, the researchers have managed to establish the existence of an optimal control for the problem in question and to synthesize the actual optimal control. The crucial issue of optimal control development is the convergence of the method of successive approximations. Besides, the suggested scheme leads to a computational procedure that involves constructing a solution of a two-point boundary value problem. As is known, this causes certain computational difficulties. In order to avoid those difficulties, the paper includes a modified scheme that converges and provides a control which is close to an optimal one. It is demonstrated that the developed scheme reduces the initial problem to a sequence of Cauchy problems that can be easily solved using the simplest methods of numerical analysis. To illustrate the suggested method, the paper shows the results of a computational experiment on developing an optimal control for a controlled system described by the Van der Pol equation. In this case, it turned out that it is the modified scheme that gives the optimal control.
Keywords: successive approximations method, bounded controls, control of non-linear systems via quadratic criterion
Visitors: 2944
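
A single Cauchy problem of the kind mentioned in the abstract can be solved with the explicit Euler method, one of the simplest methods of numerical analysis. The sketch below is not the successive approximations scheme from the paper. It assumes a controlled Van der Pol system of the form x'' - mu (1 - x^2) x' + x = u with the bound |u| <= u_max, and it uses a hypothetical saturated damping feedback u = clip(-gain * x') only to show how a bounded control enters the integration. All parameter values are illustrative.

```python
import math


def clip(value, bound):
    """Project a control value onto the admissible interval [-bound, bound]."""
    return max(-bound, min(bound, value))


def simulate(x0, v0, mu=1.0, u_max=1.0, gain=2.0, dt=0.001, t_end=10.0):
    """Integrate the controlled Van der Pol equation with explicit Euler steps."""
    x, v = x0, v0
    controls = []
    for _ in range(int(round(t_end / dt))):
        u = clip(-gain * v, u_max)  # hypothetical feedback law, saturated
        controls.append(u)
        acceleration = mu * (1.0 - x * x) * v - x + u
        # Both state components are advanced from the values at the old step.
        x, v = x + dt * v, v + dt * acceleration
    return x, v, controls


x_end, v_end, controls = simulate(x0=1.0, v0=0.0)
```

In an iterative scheme, a run like this would be repeated with the control law refined at each iteration. Only the inner initial value problem is shown here.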

9. A statistical experiment to test practical convergence in one submodular programming problem [No. 2, 2023]
Authors: Skakodub, K.R. (skakodub03@bk.ru) - Tver State University (Junior Researcher); Perevozchikov, A.G. (lesik56@mail.ru) - Tver State University (Associate Professor), Ph.D; Lesik, A.I. (pere501@yandex.ru) - RPA RusBITech-Tver JSC (Senior Researcher), Ph.D;
Abstract: The article discusses a statistical experiment to test practical convergence in one submodular programming problem. It poses the problem of maximizing the total effectiveness of group assignment. The paper introduces the concept of a mixed solution of the transport problem of group assignment, in which resource constraints are met on average. It is shown that finding mixed solutions to the group assignment transport problem can be reduced to a submodular programming problem, which can be solved by the branch-and-bound method with upper estimates based on the submodularity of the transport problem with constraints in the form of column equalities. The polynomial nature of the ε-optimal version of the branch-and-bound method has been proved only for the classical scheme for solving the multidimensional knapsack problem. The authors use a scheme that exploits the specifics of the problem; therefore, further efforts are needed to test the polynomial hypothesis, including by means of statistical experiments. The main result of the work is a numerical implementation of the ε-optimal version of the branch-and-bound method in the high-level C++ programming language and a statistical experiment to verify the practical convergence of the algorithm itself on the statistical transport problem of group assignment by assignment effectiveness. The analysis of the numerical experiment shows that, for the problem under consideration, the percentage of vertices expanded by the ε-optimal algorithm out of the total number of vertices in the directed graph decreases quite quickly with increasing dimensionality, which indicates sufficient efficiency of the algorithm. The polynomial hypothesis has not been confirmed, since the authors used a scheme based on the specifics of the problem rather than a classical algorithm for solving an integer problem.
Keywords: upper bounds of the criterion, branch-and-bound method, polynomial, statistical experiment, immersion of the original problem into a family of problems, mixed solution, transport problem of group assignment
Visitors: 3209
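
The ε-optimal pruning rule can be illustrated on the classical scheme that the abstract refers to. The sketch below is for the ordinary 0/1 knapsack problem, not the authors' scheme for the transport problem of group assignment, and it is written in Python rather than C++. The function name, the test instance and the choice of the fractional relaxation as the upper estimate are assumptions. A subtree is discarded when its upper estimate does not exceed (1 + eps) times the best value found, so the returned value is within a factor of (1 + eps) of the optimum.

```python
def knapsack_bnb(values, weights, capacity, eps=0.0):
    """Return (best value found, number of visited nodes) for 0/1 knapsack."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    best = 0
    visited = 0

    def upper_bound(k, value, room):
        # Fractional relaxation: fill greedily, split the first item that overflows.
        for i in range(k, len(v)):
            if w[i] <= room:
                room -= w[i]
                value += v[i]
            else:
                return value + v[i] * room / w[i]
        return value

    def branch(k, value, room):
        nonlocal best, visited
        visited += 1
        best = max(best, value)  # every node corresponds to a feasible selection
        if k == len(v) or upper_bound(k, value, room) <= (1.0 + eps) * best:
            return               # leaf reached or subtree pruned
        if w[k] <= room:
            branch(k + 1, value + v[k], room - w[k])  # take item k
        branch(k + 1, value, room)                    # skip item k

    branch(0, 0, capacity)
    return best, visited


exact, nodes_exact = knapsack_bnb([60, 100, 120], [10, 20, 30], 50)
approx, nodes_approx = knapsack_bnb([60, 100, 120], [10, 20, 30], 50, eps=0.25)
```

Counting visited nodes in this way, relative to the size of the full search tree, is the kind of statistic the experiment in the abstract tracks as the problem dimension grows.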

10. Applying MATLAB in the design of digital filters for selecting Pc5 geomagnetic pulsations [No. 2, 2023]
Authors: Korobeynikov, A.G. (korobeynikov_a_g@mail.ru) - The National Research University of Information Technologies, Mechanics and Optics (Professor), Ph.D;
Abstract: The paper considers a design procedure for an optimal nonrecursive bandpass digital filter with a finite impulse response (FIR filter) using the MATLAB Filter Design tool and the method of best uniform (Chebyshev) approximation. The filter helps solve the problem of extracting Pc5 geomagnetic pulsations from a data set of geomagnetic field measurements. This type of pulsation was chosen because 1-second data are available from a widely spaced network of ground-based geomagnetic observatories with standardized geophysical equipment. After proper processing, these data can be used in a detailed analysis of the disturbance properties of the Earth's magnetic field in the range of long-period pulsations and of the nature of the interaction of waves and particles in the magnetosphere. The results of this analysis can be used, for example, in calculating a space weather forecast, which makes this work relevant. The problem of selecting Pc5 pulsations is solved by passing the original data set through a bandpass FIR filter whose characteristics are determined by the range of the Pc5 pulsation period, 150–600 seconds. It follows that the passband limits are 1.7–6.7 mHz. A nonrecursive bandpass FIR filter was chosen because it can provide a linear phase-frequency characteristic, which excludes phase distortions at the output of the filter, and because this type of filter is stable by definition. The first condition also ensures that there are no requirements for the phase response of the FIR filter. The operability of the obtained digital filter is demonstrated by processing a real set of geomagnetic field measurements obtained from the Lycksele geomagnetic observatory (Sweden, Geological Survey of Sweden, international IAGA code LYC), which is part of the INTERMAGNET international network. The necessary information about this observatory is available on the Internet: https://www.intermagnet.org. Bandpass FIR filter design and calculations were carried out in MATLAB R2022b.
Keywords: INTERMAGNET, Pc5, geomagnetic ripple, bandpass filter, FIR-filter, Filter Design, MATLAB
Visitors: 3211
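
The band limits follow from the period range: 1/600 s is about 1.67 mHz and 1/150 s is about 6.67 mHz. The sketch below designs a linear-phase bandpass FIR filter for that band at the 1 Hz sampling rate of 1-second data. It deliberately swaps in the windowed-sinc method with a Hamming window, which needs only the Python standard library, in place of the Chebyshev (equiripple) design in MATLAB Filter Design that the abstract describes. The filter length of 2001 taps and all function names are assumptions.

```python
import cmath
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def bandpass_fir(f_low, f_high, fs, num_taps):
    """Windowed-sinc bandpass: difference of two ideal lowpass responses."""
    mid = (num_taps - 1) / 2
    lo, hi = f_low / fs, f_high / fs  # cutoffs in cycles per sample
    taps = []
    for n in range(num_taps):
        ideal = (2 * hi * sinc(2 * hi * (n - mid))
                 - 2 * lo * sinc(2 * lo * (n - mid)))
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def gain(taps, f, fs):
    """Magnitude of the frequency response at frequency f (Hz)."""
    return abs(sum(h * cmath.exp(-2j * math.pi * f / fs * n)
                   for n, h in enumerate(taps)))


FS = 1.0                            # 1-second data
F_LOW, F_HIGH = 1 / 600, 1 / 150    # Pc5 periods of 600 s and 150 s
taps = bandpass_fir(F_LOW, F_HIGH, FS, 2001)

passband_gain = gain(taps, 0.004, FS)   # inside the Pc5 band
dc_gain = gain(taps, 0.0, FS)           # slow trend, below the band
stopband_gain = gain(taps, 0.05, FS)    # well above the band
```

The coefficients are symmetric about the middle tap, which is what gives the filter the linear phase-frequency characteristic the abstract relies on. An equiripple design would typically meet the same specification with fewer taps.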
