{"id":513,"date":"2025-11-25T09:32:05","date_gmt":"2025-11-25T01:32:05","guid":{"rendered":"https:\/\/high-flyer.in.suopu.cc\/?page_id=513"},"modified":"2025-11-25T10:26:50","modified_gmt":"2025-11-25T02:26:50","slug":"blog","status":"publish","type":"page","link":"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/","title":{"rendered":"Blog"},"content":{"rendered":"<div data-elementor-type=\"wp-page\" data-elementor-id=\"513\" class=\"elementor elementor-513\" data-elementor-post-type=\"page\">\n\t\t\t\t<div class=\"elementor-element elementor-element-0012ab2 e-flex e-con-boxed e-con e-parent\" data-id=\"0012ab2\" data-element_type=\"container\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t\t\t\t<div class=\"e-con-inner\">\n\t\t\t\t<div class=\"elementor-element elementor-element-1fe1326 elementor-widget elementor-widget-heading\" data-id=\"1fe1326\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">HIGH-FLYER | AI BLOG<\/h2>\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-5909ce8 e-con-full e-flex e-con e-child\" data-id=\"5909ce8\" data-element_type=\"container\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t\t\t<div class=\"elementor-element elementor-element-3906f7d elementor-widget elementor-widget-heading\" data-id=\"3906f7d\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">New Releases<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3bcdef3 elementor-grid-1 elementor-grid-tablet-1 elementor-posts--thumbnail-left blog-top-post elementor-grid-mobile-1 elementor-widget elementor-widget-posts\" data-id=\"3bcdef3\" data-element_type=\"widget\" 
data-settings=\"{&quot;classic_columns&quot;:&quot;1&quot;,&quot;classic_columns_tablet&quot;:&quot;1&quot;,&quot;classic_columns_mobile&quot;:&quot;1&quot;,&quot;classic_row_gap&quot;:{&quot;unit&quot;:&quot;px&quot;,&quot;size&quot;:35,&quot;sizes&quot;:[]},&quot;classic_row_gap_tablet&quot;:{&quot;unit&quot;:&quot;px&quot;,&quot;size&quot;:&quot;&quot;,&quot;sizes&quot;:[]},&quot;classic_row_gap_mobile&quot;:{&quot;unit&quot;:&quot;px&quot;,&quot;size&quot;:&quot;&quot;,&quot;sizes&quot;:[]}}\" data-widget_type=\"posts.classic\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-posts-container elementor-posts elementor-posts--skin-classic elementor-grid\" role=\"list\">\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-504 post type-post status-publish format-standard has-post-thumbnail hentry category-basic-research\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/504\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img fetchpriority=\"high\" decoding=\"async\" width=\"768\" height=\"300\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-768x300.jpg\" class=\"attachment-medium_large size-medium_large wp-image-505\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-768x300.jpg 768w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-300x117.jpg 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-1024x400.jpg 1024w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-150x59.jpg 150w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12.jpg 1080w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a 
href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/504\/\" >\n\t\t\t\tFlashAttention: A Novel Attention Algorithm with IO Awareness, Fast and Memory-Efficient\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>At the heart of the Transformer model is the self-attention mechanism, whose time and memory complexity are both O(N^2) in the sequence length N. As large language models (LLMs) continue to scale, giving them longer context windows poses a significant engineering challenge. A team of researchers from the Department of Computer Science at Stanford University and the State University of New York at Buffalo has published a novel attention algorithm called FlashAttention, which not only has a longer context than PyT<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<\/div>\n\t\t\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-cf821d2 e-flex e-con-boxed e-con e-parent\" data-id=\"cf821d2\" data-element_type=\"container\">\n\t\t\t\t\t<div class=\"e-con-inner\">\n\t\t\t\t<div class=\"elementor-element elementor-element-7907605 elementor-widget elementor-widget-heading\" data-id=\"7907605\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Categories<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0bb47c1 elementor-nav-menu--dropdown-none blog-category-nav elementor-widget elementor-widget-nav-menu\" data-id=\"0bb47c1\" data-element_type=\"widget\" data-settings=\"{&quot;layout&quot;:&quot;horizontal&quot;,&quot;submenu_icon&quot;:{&quot;value&quot;:&quot;&lt;svg aria-hidden=\\&quot;true\\&quot; 
class=\\&quot;e-font-icon-svg e-fas-caret-down\\&quot; viewBox=\\&quot;0 0 320 512\\&quot; xmlns=\\&quot;http:\\\/\\\/www.w3.org\\\/2000\\\/svg\\&quot;&gt;&lt;path d=\\&quot;M31.3 192h257.3c17.8 0 26.7 21.5 14.1 34.1L174.1 354.8c-7.8 7.8-20.5 7.8-28.3 0L17.2 226.1C4.6 213.5 13.5 192 31.3 192z\\&quot;&gt;&lt;\\\/path&gt;&lt;\\\/svg&gt;&quot;,&quot;library&quot;:&quot;fa-solid&quot;}}\" data-widget_type=\"nav-menu.default\">\n\t\t\t\t\t\t\t\t<nav aria-label=\"Menu\" class=\"elementor-nav-menu--main elementor-nav-menu__container elementor-nav-menu--layout-horizontal e--pointer-none\">\n\t\t\t\t<ul id=\"menu-1-0bb47c1\" class=\"elementor-nav-menu\"><li class=\"menu-item menu-item-type-post_type menu-item-object-page menu-item-541\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/\" class=\"elementor-item\">ALL<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-508\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/basic-research\/\" class=\"elementor-item\">basic research<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-511\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/firefly-platform\/\" class=\"elementor-item\">Firefly Platform<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-512\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/firefly-run-model\/\" class=\"elementor-item\">Firefly Running Model<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-507\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/hfai-mantra\/\" class=\"elementor-item\">hfai Mindfulness<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-509\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/infrastructure\/\" class=\"elementor-item\">infrastructure<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-510\"><a 
href=\"https:\/\/high-flyer.in.suopu.cc\/en\/parallel-optimization\/\" class=\"elementor-item\">parallel optimization<\/a><\/li>\n<\/ul>\t\t\t<\/nav>\n\t\t\t\t\t\t<nav class=\"elementor-nav-menu--dropdown elementor-nav-menu__container\" aria-hidden=\"true\">\n\t\t\t\t<ul id=\"menu-2-0bb47c1\" class=\"elementor-nav-menu\"><li class=\"menu-item menu-item-type-post_type menu-item-object-page menu-item-541\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/\" class=\"elementor-item\" tabindex=\"-1\">ALL<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-508\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/basic-research\/\" class=\"elementor-item\" tabindex=\"-1\">basic research<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-511\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/firefly-platform\/\" class=\"elementor-item\" tabindex=\"-1\">Firefly Platform<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-512\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/firefly-run-model\/\" class=\"elementor-item\" tabindex=\"-1\">Firefly Running Model<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-507\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/hfai-mantra\/\" class=\"elementor-item\" tabindex=\"-1\">hfai Mindfulness<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-509\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/infrastructure\/\" class=\"elementor-item\" tabindex=\"-1\">infrastructure<\/a><\/li>\n<li class=\"menu-item menu-item-type-taxonomy menu-item-object-category menu-item-510\"><a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/parallel-optimization\/\" class=\"elementor-item\" tabindex=\"-1\">parallel optimization<\/a><\/li>\n<\/ul>\t\t\t<\/nav>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element 
elementor-element-23c9907 e-flex e-con-boxed e-con e-parent\" data-id=\"23c9907\" data-element_type=\"container\">\n\t\t\t\t\t<div class=\"e-con-inner\">\n\t\t\t\t<div class=\"elementor-element elementor-element-dcfc64c elementor-grid-2 blog-bottom-post elementor-grid-tablet-2 elementor-grid-mobile-1 elementor-posts--thumbnail-top elementor-widget elementor-widget-posts\" data-id=\"dcfc64c\" data-element_type=\"widget\" data-settings=\"{&quot;classic_columns&quot;:&quot;2&quot;,&quot;classic_row_gap&quot;:{&quot;unit&quot;:&quot;px&quot;,&quot;size&quot;:20,&quot;sizes&quot;:[]},&quot;pagination_type&quot;:&quot;numbers_and_prev_next&quot;,&quot;classic_columns_tablet&quot;:&quot;2&quot;,&quot;classic_columns_mobile&quot;:&quot;1&quot;,&quot;classic_row_gap_tablet&quot;:{&quot;unit&quot;:&quot;px&quot;,&quot;size&quot;:&quot;&quot;,&quot;sizes&quot;:[]},&quot;classic_row_gap_mobile&quot;:{&quot;unit&quot;:&quot;px&quot;,&quot;size&quot;:&quot;&quot;,&quot;sizes&quot;:[]}}\" data-widget_type=\"posts.classic\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-posts-container elementor-posts elementor-posts--skin-classic elementor-grid\" role=\"list\">\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-504 post type-post status-publish format-standard has-post-thumbnail hentry category-basic-research\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/504\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img fetchpriority=\"high\" decoding=\"async\" width=\"768\" height=\"300\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-768x300.jpg\" class=\"attachment-medium_large size-medium_large wp-image-505\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-768x300.jpg 768w, 
https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-300x117.jpg 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-1024x400.jpg 1024w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12-150x59.jpg 150w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-12.jpg 1080w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/504\/\" >\n\t\t\t\tFlashAttention: A Novel Attention Algorithm with IO Awareness, Fast and Memory-Efficient\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>At the heart of the Transformer model is the self-attention mechanism, whose time and memory complexity are both O(N^2) in the sequence length N. As large language models (LLMs) continue to scale, giving them longer context windows poses a significant engineering challenge. 
A team of researchers from the Department of Computer Science at Stanford University and the State University of New York at Buffalo has published a novel attention algorithm called FlashAttention, which not only has a longer context than PyT<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-501 post type-post status-publish format-standard has-post-thumbnail hentry category-parallel-optimization\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/501\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img decoding=\"async\" width=\"768\" height=\"299\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-768x299.png\" class=\"attachment-medium_large size-medium_large wp-image-502\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-768x299.png 768w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-300x117.png 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-1024x399.png 1024w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-150x58.png 150w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11.png 1200w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/501\/\" >\n\t\t\t\tPyTorch Distributed Training Method\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>In 2018, the Bert model with nearly 300 million parameters came out of nowhere, pushing the NLP field to new heights. 
In recent years, AI research has increasingly focused on large models, and the major AI companies have each released models with hundreds of billions of parameters, giving rise to many new AI application scenarios. At the same time, several factors continue to drive the rapid development of large models: 1) society is undergoing a deep digital transformation, and large amounts of data are gradually being consolidated, giving rise to many AI application scenarios and needs; 2) hardware technology continues to advance: the NVIDIA A100 GPU, Go<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-498 post type-post status-publish format-standard has-post-thumbnail hentry category-parallel-optimization\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/498\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img decoding=\"async\" width=\"750\" height=\"300\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-10.png\" class=\"attachment-medium_large size-medium_large wp-image-499\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-10.png 750w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-10-300x120.png 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-10-150x60.png 150w\" sizes=\"(max-width: 750px) 100vw, 750px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/498\/\" >\n\t\t\t\tAlphafold Training Optimization 01 | Data Processing Optimization\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span 
class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>If one achievement deserves to be called the most exciting in AI academia in 2021, it is Alphafold. Alphafold2 achieved far greater accuracy than comparable models on the CASP14 protein structure prediction challenge and, for the first time, brought the accuracy of protein structure prediction to the atomic level, which is already close to that of experimental measurements. The Phantom AI team successfully trained and ran Alphafold2 on the Firefly 2 platform shortly after its release, as described in our previous article, \"Firefly Running Models | Alphafold\". <\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-495 post type-post status-publish format-standard has-post-thumbnail hentry category-parallel-optimization\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/495\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img loading=\"lazy\" decoding=\"async\" width=\"768\" height=\"335\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-9-768x335.png\" class=\"attachment-medium_large size-medium_large wp-image-496\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-9-768x335.png 768w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-9-300x131.png 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-9-1024x447.png 1024w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-9-150x65.png 150w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-9.png 1185w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 
class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/495\/\" >\n\t\t\t\tAlphafold Training Optimization 02 | Multi-Card Training Speedup\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>As mentioned in the previous article, Phantom AI improved the overall training performance of Alphafold by optimizing data processing through feature preprocessing and feature cropping. Phantom AI also has a number of parallel training accelerators, such as hfreduce, 3FS, and the hfai.nn operator library. Can they further accelerate the overall training of Alphafold? In this issue, we put that question to the test. hfreduce As mentioned in the previous article \"Phantom Power | Model Parallel Training Tool: hfreduce\", due to the Phantom AI architecture<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-492 post type-post status-publish format-standard has-post-thumbnail hentry category-parallel-optimization\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/492\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"380\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-8.png\" class=\"attachment-medium_large size-medium_large wp-image-493\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-8.png 750w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-8-300x152.png 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-8-150x76.png 150w\" sizes=\"(max-width: 750px) 100vw, 750px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div 
class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/492\/\" >\n\t\t\t\tAlphafold Training Optimization 03 | Pitfall Diary\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>The previous two articles showed how Phantom AI optimized Alphafold: feature preprocessing and feature cropping improved data processing performance, parallel training acceleration tools further increased training speed, and tight integration with the characteristics of Phantom AI's clusters maximized computational efficiency. Looking at the whole picture, what else should we pay attention to when training Alphafold on Phantom Firefly II, and how can we optimize similar deep learning models in the future? 
On these topics, this article will talk to you about Phantom Cube AI's<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-489 post type-post status-publish format-standard has-post-thumbnail hentry category-infrastructure\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/489\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img loading=\"lazy\" decoding=\"async\" width=\"768\" height=\"514\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-7-768x514.png\" class=\"attachment-medium_large size-medium_large wp-image-490\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-7-768x514.png 768w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-7-300x201.png 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-7-1024x685.png 1024w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-7-150x100.png 150w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-7.png 1200w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/489\/\" >\n\t\t\t\t3FS Optimization 01 | Server-Side Optimization\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>As introduced in the article \"Phantom Power | High Speed File Series 3FS\", Phantom AI has designed a sample reading file system, 3FS, which is very suitable for deep learning training. 
3FS uses Direct IO and RDMA Read so that model training gets high read bandwidth for sample reading with minimal CPU and memory overhead, which removes the wait for data loading during training and makes fuller use of the GPU's computational performance. As we know, file systems are generally divided into a client side and a server side. In the 3FS file system<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-486 post type-post status-publish format-standard has-post-thumbnail hentry category-infrastructure\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/486\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img loading=\"lazy\" decoding=\"async\" width=\"554\" height=\"129\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-6.png\" class=\"attachment-medium_large size-medium_large wp-image-487\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-6.png 554w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-6-300x70.png 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-6-150x35.png 150w\" sizes=\"(max-width: 554px) 100vw, 554px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/486\/\" >\n\t\t\t\t3FS Optimization 02 | Client Memory Usage Optimization\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>As introduced in 
the article \"Phantom Power | High-Speed File Series 3FS\", Phantom AI has designed a sample-reading file system, 3FS, that is ideal for deep learning training. 3FS uses Direct IO and RDMA Read so that model training gets ultra-high read bandwidth for sample reading with minimal CPU and memory overhead, which removes the wait for data loading during training and makes fuller use of the GPU's computational performance. As we know, file systems are generally divided into a client side and a server side. In the 3FS file system, the client part<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-483 post type-post status-publish format-standard has-post-thumbnail hentry category-infrastructure\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/483\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img loading=\"lazy\" decoding=\"async\" width=\"768\" height=\"513\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-5-768x513.png\" class=\"attachment-medium_large size-medium_large wp-image-484\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-5-768x513.png 768w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-5-300x200.png 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-5-1024x684.png 1024w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-5-150x100.png 150w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-5.png 1200w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a 
href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/483\/\" >\n\t\t\t\t3FS Optimization 03 | Data Read Mode Adaptation\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>As introduced in the article \"Phantom Power | High-Speed File Series 3FS\", Phantom AI has designed a sample-reading file system, 3FS, which is ideal for deep learning training. 3FS uses Direct IO and RDMA Read so that model training gets high read bandwidth for sample reading with minimal CPU and memory overhead, which removes the wait for data loading during training and makes fuller use of the GPU's computational performance. In practice, however, we ran into many problems we had not anticipated, such as interference between tasks.<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-480 post type-post status-publish format-standard has-post-thumbnail hentry category-basic-research\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/480\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img loading=\"lazy\" decoding=\"async\" width=\"768\" height=\"336\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-4-768x336.png\" class=\"attachment-medium_large size-medium_large wp-image-481\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-4-768x336.png 768w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-4-300x131.png 300w, 
https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-4-1024x448.png 1024w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-4-150x66.png 150w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-4.png 1080w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/480\/\" >\n\t\t\t\tA bit of our practice in reducing network congestion (I)\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>For deep learning developers and researchers, high-performance computing power is an important weapon for research success. Among the factors that affect deep learning training speed, the role of network transmission is easy to overlook. Especially in large-scale clusters and distributed training scenarios, network congestion can leave GPU computing power idle, just as a two-way eight-lane expressway with poorly planned traffic turns into a large parking lot. In this issue, we share some of Phantom AI's thinking and optimization work on networking. 
First, let's talk about network<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<article class=\"elementor-post elementor-grid-item post-477 post type-post status-publish format-standard has-post-thumbnail hentry category-hfai-mantra\" role=\"listitem\">\n\t\t\t\t<a class=\"elementor-post__thumbnail__link\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/477\/\" tabindex=\"-1\" >\n\t\t\t<div class=\"elementor-post__thumbnail\"><img loading=\"lazy\" decoding=\"async\" width=\"768\" height=\"287\" src=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-768x287.jpg\" class=\"attachment-medium_large size-medium_large wp-image-478\" alt=\"\" srcset=\"https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-768x287.jpg 768w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-300x112.jpg 300w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-1024x383.jpg 1024w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11-150x56.jpg 150w, https:\/\/high-flyer.in.suopu.cc\/wp-content\/uploads\/2025\/11\/banner-11.jpg 1080w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/div>\n\t\t<\/a>\n\t\t\t\t<div class=\"elementor-post__text\">\n\t\t\t\t<h3 class=\"elementor-post__title\">\n\t\t\t<a href=\"https:\/\/high-flyer.in.suopu.cc\/en\/blog\/477\/\" >\n\t\t\t\thfai python | Task submission at will, Firefly training on the fly\t\t\t<\/a>\n\t\t<\/h3>\n\t\t\t\t<div class=\"elementor-post__meta-data\">\n\t\t\t\t\t<span class=\"elementor-post-date\">\n\t\t\t2025-11-25\t\t<\/span>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-post__excerpt\">\n\t\t\t<p>Phantom AI released its deep learning suite hfai, which has been in use for many years, and attracted many peer researchers and developers to inquire about trying it out. 
The suite has many features, and once you are familiar with its conventions, you can easily call on the platform's computing resources to complete training tasks efficiently. For this reason, we have created a series called \u201chfai's Methods of Use\u201d to introduce the design ideas and principles behind some of hfai's features, so that you can learn them better and faster and become comfortable with hfai's set of \u201cmagical skills\u201d. To cope with the challenges of deep learning assignments<\/p>\n\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/article>\n\t\t\t\t<\/div>\n\t\t\n\t\t\t\t<div class=\"e-load-more-anchor\" data-page=\"1\" data-max-page=\"3\" data-next-page=\"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/pages\/513\/page\/2\/\"><\/div>\n\t\t\t\t<nav class=\"elementor-pagination\" aria-label=\"Pagination\">\n\t\t\t<span class=\"page-numbers prev\">&lt; Previous page<\/span>\n<span aria-current=\"page\" class=\"page-numbers current\"><span class=\"elementor-screen-only\">Page<\/span>1<\/span>\n<a class=\"page-numbers\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/pages\/513\/page\/2\/\"><span class=\"elementor-screen-only\">Page<\/span>2<\/a>\n<a class=\"page-numbers\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/pages\/513\/page\/3\/\"><span class=\"elementor-screen-only\">Page<\/span>3<\/a>\n<a class=\"page-numbers next\" href=\"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/pages\/513\/page\/2\/\">Next &gt;<\/a>\t\t<\/nav>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>","protected":false},"excerpt":{"rendered":"<p>HIGH-FLYER | AI BLOG New Releases 
Categories<\/p>","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"elementor_header_footer","meta":{"_acf_changed":false,"footnotes":""},"class_list":["post-513","page","type-page","status-publish","hentry"],"acf":[],"_links":{"self":[{"href":"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/pages\/513","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/comments?post=513"}],"version-history":[{"count":58,"href":"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/pages\/513\/revisions"}],"predecessor-version":[{"id":578,"href":"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/pages\/513\/revisions\/578"}],"wp:attachment":[{"href":"https:\/\/high-flyer.in.suopu.cc\/en\/wp-json\/wp\/v2\/media?parent=513"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}