Seminar of 9th Alushta-2025 scientific conference participants and presentations by young scientists at 61st PAC NP meeting
Seminars
Meshcheryakov Laboratory of Information Technologies
Joint Laboratory Seminar
Date and Time: Wednesday, 4 June 2025, at 11:00 AM
Venue: room 310, Meshcheryakov Laboratory of Information Technologies
Information about the seminar on Indico
-
Seminar topic: “SPD Online Filter High-Throughput Processing Middleware”
Speaker: Nikita Greben
Abstract:
SPD Online Filter is a hardware-software system designed for multi-stage, high-throughput processing of data from the SPD detector. Its main task is the primary processing of the data to reduce the volume that goes to long-term storage and subsequent full processing. SPD Online Filter consists of a dedicated computing cluster, middleware, and a set of application-level services. The middleware layer comprises three microservice-based systems that communicate via lightweight API gateways for request routing and a message broker that decouples producers from consumers. Together, they form a configurable, fault-tolerant, and scalable data-processing pipeline. This report presents the architecture of the overall system and its constituent subsystems, demonstrates the coordinated interaction among components, and shows how they work together to deliver reliable, scalable, real-time processing of raw detector data that meets the SPD experiment’s requirements.
-
Seminar topic: “SPD Data management”
Speaker: Alexey Konak
Abstract:
Active preparations to launch the SPD experiment at the NICA collider are underway, but research in the field of spin physics has already begun. Researchers are currently working with large amounts of data obtained during the simulation and reconstruction of the physical processes under study. To ensure the reliable storage, distribution and accessibility of this data, a dedicated infrastructure has been deployed. This report describes the current status of SPD data management: how the data are processed, stored and distributed among different data centers. It discusses current processing, the volume of data produced, and the storage strategies used. The tools and solutions employed for data management will also be examined. Through this report, listeners will gain insight into the progress of work on data management for the SPD experiment, as well as the challenges encountered in data processing and long-term storage.
-
Seminar topic: “Pilot applications for distributed task execution in the SPD online filter system”
Speaker: Leonid Romanychev
Abstract:
Pilot applications play a crucial role in distributed computing, enabling dynamic resource management and workload execution. These applications are widely used in high-performance computing and large-scale experiments, providing a flexible mechanism for managing computational tasks. However, the lack of a unified abstraction and best practices has led to the emergence of numerous implementations with varying degrees of portability and efficiency. This talk will explore different architectures of pilot applications, their key components, and operational principles. Special attention will be given to the late-binding mechanism, which allows for dynamic task distribution and improved resource utilization efficiency. Our solution is a two-component system consisting of a pilot and a daemon. It employs a multithreaded approach that accounts for the specifics of the SPD experiment, ensuring task execution, monitoring, and status reporting. The presentation will provide insights into the use of pilot applications in distributed systems and their specific application in the SPD experiment.
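The late-binding idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the speaker's pilot or daemon implementation: the names `pilot` and `run_pilots`, the in-process queue, and the status tuple layout are all assumptions made for illustration. The point it shows is that a task is not assigned to a worker in advance; each pilot pulls the next task only when it is free, then reports the task status.

```python
import queue
import threading


def pilot(name, tasks, statuses, lock):
    """Hypothetical pilot: pulls tasks until the shared queue is empty.

    Late binding: a task becomes bound to this pilot only at the moment
    it is pulled, not when it was submitted.
    """
    while True:
        try:
            task_id, payload = tasks.get_nowait()
        except queue.Empty:
            return  # no work left, the pilot exits
        try:
            state, result = "done", payload()  # execute the workload
        except Exception as exc:  # report failures instead of dying silently
            state, result = "failed", repr(exc)
        with lock:
            statuses[task_id] = (name, state, result)  # status reporting
        tasks.task_done()


def run_pilots(payloads, n_pilots=3):
    """Submit payloads to one shared queue and let n_pilots threads drain it."""
    tasks = queue.Queue()
    for task_id, payload in enumerate(payloads):
        tasks.put((task_id, payload))
    statuses = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=pilot, args=(f"pilot-{i}", tasks, statuses, lock))
        for i in range(n_pilots)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return statuses
```

Calling `run_pilots([lambda n=n: n * n for n in range(5)])` returns one status entry per task; which pilot handled which task varies between runs, which is exactly the effect of binding tasks at pull time.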
-
Seminar topic: “Automation of BM@N Run9 data processing on a DIRAC distributed infrastructure”
Speaker: Igor Pelevanyuk
Abstract:
The 9th data-taking run of the BM@N experiment is scheduled for spring 2025. Since February 2023, when data from the 8th run were acquired, BM@N data processing has been carried out on a geographically distributed heterogeneous infrastructure based on the DIRAC Interware. For the 9th run, an automated task-launching methodology has been developed. Processing is triggered by the appearance of RAW-type files associated with the 9th run in the DIRAC file catalog. A dedicated service periodically checks the catalog for new files that require processing and initiates the corresponding tasks. Since BM@N data processing occurs in two stages (first RAW → DIGI format conversion, then DIGI → DST conversion), two task triggers must be defined: one for the arrival of RAW files and another for DIGI files. Automating the processing pipeline enables rapid feedback on the quality of the experimental data, allowing for timely data quality monitoring and issue resolution.
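The two-trigger scheme described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: `Catalog` is an in-memory stand-in and does not use the DIRAC API, and `Trigger`, `run_task`, `poll_once` and the file-naming rule are hypothetical names introduced here. Each trigger watches one file type, remembers which files it has already handled, and submits one task per new file.

```python
from dataclasses import dataclass, field

# Processing stages from the abstract: input file type -> output file type.
STAGES = {"RAW": "DIGI", "DIGI": "DST"}


@dataclass
class Catalog:
    """In-memory stand-in for a file catalog (assumption, not the DIRAC API)."""
    files: dict = field(default_factory=dict)  # file name -> file type

    def register(self, name, file_type):
        self.files[name] = file_type


@dataclass
class Trigger:
    """Yields one task for each not-yet-seen file of a given type."""
    input_type: str
    seen: set = field(default_factory=set)

    def poll(self, catalog):
        new = sorted(
            name for name, ftype in catalog.files.items()
            if ftype == self.input_type and name not in self.seen
        )
        self.seen.update(new)
        return [(name, STAGES[self.input_type]) for name in new]


def run_task(catalog, name, output_type):
    """Pretend to convert a file and register the product in the catalog."""
    stem = name.rsplit(".", 1)[0]
    out = f"{stem}.{output_type.lower()}"
    catalog.register(out, output_type)
    return out


def poll_once(catalog, triggers):
    """One pass of the periodic check: fire every trigger and run its tasks."""
    produced = []
    for trigger in triggers:
        for name, output_type in trigger.poll(catalog):
            produced.append(run_task(catalog, name, output_type))
    return produced
```

In this sketch a real service would call `poll_once` on a timer with `triggers = [Trigger("RAW"), Trigger("DIGI")]`. Because the DIGI trigger reacts to files registered by the RAW stage, the second stage starts without any explicit coupling between the two.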
-
Seminar topic: “Digital technology map: detectors, accelerators, competencies”
Speaker: Anna Ilina
Abstract:
JINR has accumulated extensive experience in the development of detector and accelerator systems and related equipment, and in collaboration with industrial and scientific partners. However, the lack of a centralized knowledge base made it difficult to find information on existing technologies, competencies, and suppliers, which limited the exchange of experience between departments. To address this issue, a web service was developed and subsequently included in the services of the JINR Digital EcoSystem. It enables the registration and contextual search of data on the Institute’s equipment, materials, technologies, and accumulated competencies. The service also provides data visualization for JINR employees and integrates information from different scientific groups and departments. The project was created by young researchers, IT specialists, and scientists.
-
Seminar topic: “Collection and systematization of scientific publications for the JINR digital repository”
Speaker: Andrey Kondratyev
Abstract:
The importance of digital publication repositories as information systems that make scientific research results available can hardly be overstated today. Developing and modernizing their functionality for the automated collection of bibliographic metadata is a relevant task. At JINR, the lack of institutional repository structures leaves room for finding solutions to this problem. Effective access to up-to-date information on employees' scientific publications related to JINR is very important for assessing the intellectual potential of the Institute. Automated systems reduce duplication and manual entry of publication data, both of which limit access to scientific information, and increase the efficiency of its analysis. A modern repository integrates data from verified sources into a single system and provides long-term storage of and convenient access to the Institute’s information assets.