
[AI Creation Camp] E-commerce Knowledge Graph Link Prediction

Date: 2025-07-25    Author: 游乐小编



E-commerce Knowledge Graph Link Prediction

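Knowledge graphs are an important technology in the AI era, but they are generally incomplete. Knowledge graph link prediction predicts missing triples, mainly from learned representations of entities and relations. This task aims to improve knowledge graph embeddings in e-commerce scenarios so that applications such as product recommendation can infer latent associations between products.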

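This project introduces the principles of the TransE algorithm, uses PGL to train and run inference with TransE on OpenBG500, and compares its training results with those of DistMult, ComplEx, RotatE, and OTE.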

Figure 1. Visualization of a subset of OpenBG triples

Dataset Overview

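OpenBG Benchmark is a large-scale open benchmark for digital-commerce knowledge graphs and contains several sub-dataset tasks. It is built on OpenBG [2], an open digital-commerce knowledge graph: a million-scale multimodal dataset organized under a unified schema that covers products and consumer needs. OpenBG is provided by Alibaba's 藏经阁 (Cangjingge) team and Zhejiang University. It is released in order to use open business knowledge to uncover socio-economic value, to promote interdisciplinary research in fields such as digital commerce and the digital economy, and to serve the national strategic need for healthy development of the digital economy. The first release contains three categories of tasks; this project implements one of them, product relation reasoning and link prediction.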

Product Relation Reasoning and Link Prediction

Task Description

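Because knowledge graphs are generally incomplete, relation reasoning and link prediction techniques are needed to predict the missing graph nodes. This task aims to improve knowledge graph embeddings in digital-commerce scenarios so that applications such as product recommendation can infer latent associations between products.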

Task Details

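A knowledge graph usually organizes its data as triples (h, r, t), where h is the head entity, t is the tail entity, and r is the relation that links them. In the accompanying figure, ("cotton pads", "brand", "Watsons") is one such triple. Link prediction means predicting the missing tail entity (or head entity) when the head entity (or tail entity) and the relation are known. In the same figure, ("cotton pads", "applicable group", ?) is a link prediction query whose tail entity has to be predicted.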
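To make the triple format and the shape of a link prediction query concrete, here is a minimal Python sketch. It uses English renderings of the example triple from the text; the second triple and the helper names `build_index` and `known_tails` are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# The first triple is the example from the text; the second is a
# hypothetical triple added only so the toy graph has more than one edge.
triples = [
    ("cotton pads", "brand", "Watsons"),
    ("cotton pads", "category", "beauty tools"),  # hypothetical
]


def build_index(triples):
    """Map each (head, relation) pair to the set of tails observed for it."""
    index = defaultdict(set)
    for head, relation, tail in triples:
        index[(head, relation)].add(tail)
    return index


def known_tails(index, head, relation):
    """Return the tails already stored in the graph for (head, relation, ?)."""
    return index.get((head, relation), set())


index = build_index(triples)

# A query whose answer is already in the graph.
print(known_tails(index, "cotton pads", "brand"))

# A query with no stored answer: this is the gap a link prediction
# model is asked to fill by ranking candidate tail entities.
print(known_tails(index, "cotton pads", "applicable group"))
```

A lookup like this can only return facts that are already stored; a link prediction model instead scores every candidate entity for the empty slot and returns a ranking.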

Datasets

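Unlike the product commonsense salience reasoning task and the same-product mining task, the link prediction task consists of three sub-task datasets: OpenBG500, OpenBG500-L, and OpenBG-IMG. OpenBG500 contains 500 relation types and graph data at the million scale. OpenBG500-L scales up OpenBG500 to the ten-million scale and is a large-scale knowledge graph for the e-commerce domain. OpenBG-IMG is a multimodal knowledge graph for the e-commerce domain. All three datasets are built on OpenBG, with the construction process shown below: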

Figure 4. Description of Task 3

Preparation

In [2]
# Environment setup
# Run this once when the project first starts
%cd /home/aistudio/
!unzip  -q -d data/OpenBG500 data/data177429/OpenBG500.zip
# # Fetch PGL
# # 1. Clone from GitHub. The connection may fail, so retry a few times or switch to Gitee
# #    (the Gitee mirror is a two-year-old version and may have problems)
# # gitee
# !git clone https://gitee.com/paddlepaddle/PGL.git
# # github
# !git clone https://github.com/PaddlePaddle/PGL.git
# # 2. If GitHub is unreachable, download the project locally, upload the archive, and unzip it
# !unzip -d PGL 0121d96a5ffb385024f8ba13285da5880dd2753c.zip
!unzip -q /home/aistudio/PGL.zip
!mv /home/aistudio/home/aistudio/data/PGL /home/aistudio
# # Enter the PGL directory and install the dependencies
# %cd PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c
# !pip install -r requirements.txt
# # # Enter the apps directory
# %cd apps
# networkx is used to plot the graph and get a rough view of it
!pip install --upgrade numpy
!pip install networkx
# Install the Paddle graph learning package
!pip install pgl
Output:
/home/aistudio
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: numpy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.19.5)
Collecting numpy
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6d/ad/ff3b21ebfe79a4d25b4a4f8e5cf9fd44a204adb6b33c09010f566f51027a/numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.7/15.7 MB 2.4 MB/s eta 0:00:00
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.5
    Uninstalling numpy-1.19.5:
      Successfully uninstalled numpy-1.19.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
parl 1.4.1 requires pyzmq==18.1.1, but you have pyzmq 23.2.1 which is incompatible.
Successfully installed numpy-1.21.6
[notice] A new release of pip available: 22.1.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: networkx in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (2.4)
Requirement already satisfied: decorator>=4.3.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from networkx) (4.4.2)
[notice] A new release of pip available: 22.1.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting pgl
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/e2/86/f32837dff03a494d6a3b3e9f578c3e12df32e05ddb389a47a02fbd1f9455/pgl-2.2.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.2/9.2 MB 1.7 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.16.4 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pgl) (1.21.6)
Requirement already satisfied: cython>=0.25.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from pgl) (0.29)
Installing collected packages: pgl
Successfully installed pgl-2.2.4
[notice] A new release of pip available: 22.1.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip

A First Look at the Dataset

In [3]
# Take a rough look at the training set
import pandas as pd

bg_train = pd.read_csv(
    '/home/aistudio/data/OpenBG500/OpenBG500_train.tsv',
    sep='\t',
    index_col=False,
    header=None)
print("=======First five rows of the training set=========")
print(bg_train.head())
print("\n\n=======Summary of the training set========")
print(bg_train.describe())

Output:
=======First five rows of the training set=========
            0         1           2
0  ent_135492  rel_0352  ent_015651
1  ent_020765  rel_0448  ent_214183
2  ent_106905  rel_0418  ent_121073
3  ent_098167  rel_0343  ent_090402
4  ent_155261  rel_0225  ent_100806


=======Summary of the training set========
                 0         1           2
count      1242550   1242550     1242550
unique      116721       500      133025
top     ent_172515  rel_0418  ent_109153
freq          2208    416742       42649
In [14]
# Data preprocessing
# PGL handles data in txt and dict formats, so a custom script is needed to convert the OpenBG format into one PGL supports
!python /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph2KG/dataset/convertor.py
Output:
file saved at /home/aistudio/data/OpenBG500/entities.dict
file saved at /home/aistudio/data/OpenBG500/relations.dict
file saved at /home/aistudio/data/OpenBG500/train.txt
file saved at /home/aistudio/data/OpenBG500/test.txt
file saved at /home/aistudio/data/OpenBG500/valid.txt
In [1]
import os

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

# Plot a random contiguous slice of the training triples to get a rough view
# of the graph (networkx may fail if the sample is too large).

# Load the relation and entity dictionaries
def get_dict():
    rel_dict = dict()
    ent_dict = dict()
    with open('/home/aistudio/data/OpenBG500/relations.dict', 'r') as f:
        for line in f:
            k, v = line.strip().split('\t')
            rel_dict[k] = v
    with open('/home/aistudio/data/OpenBG500/entities.dict', 'r') as f:
        for line in f:
            k, v = line.strip().split('\t')
            ent_dict[k] = v
    return rel_dict, ent_dict

rel_dict, ent_dict = get_dict()

# Select a slice of the data
sample_num = 4000
with open('/home/aistudio/data/OpenBG500/train.txt', 'r') as f:
    lines = f.readlines()
s = np.random.randint(0, len(lines) - sample_num)
data = lines[s:s + sample_num]

# Map entities and relations to ids; returns (head, tail, relation)
def to_ids(line, rel_dict, ent_dict):
    parts = line.strip().split('\t')
    return ent_dict[parts[0]], ent_dict[parts[2]], rel_dict[parts[1]]

data = [to_ids(i, rel_dict, ent_dict) for i in data]

plt.figure(figsize=(10, 10), dpi=200)
graph = nx.Graph()
# The relation id is stored as the edge weight
graph.add_weighted_edges_from(data)

# One RGB colour per node. The number of distinct nodes generally differs from
# sample_num (the number of edges), so size the array from the graph itself.
num_nodes = graph.number_of_nodes()
node_color = np.concatenate(
    [np.linspace(0, 1, num_nodes)[:, None], np.zeros([num_nodes, 2])], axis=1)

# Drawing options (the original dict repeated 'node_color' and 'width';
# only the last value of each key took effect, so those are kept here)
options = {
    'node_size': 10,
    'node_color': node_color,
    'width': 0.5,
}
nx.draw(graph, **options)
os.makedirs('/home/aistudio/result/images', exist_ok=True)
plt.savefig('/home/aistudio/result/images/graph.png')
plt.show()

Training

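Representation learning aims to learn low-dimensional dense vectors that capture semantic information, and knowledge representation learning is representation learning for the entities and relations in a knowledge base. Today's large-scale knowledge bases (also called knowledge graphs) underpin many NLP tasks, but because they are both huge and incomplete, storing and completing them efficiently has become an important problem, and solving it relies on knowledge representation learning.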

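TransE is a classic knowledge representation learning algorithm that describes the triples in a knowledge base with distributed representations. Representations of this kind avoid building large tree structures and let semantic information be obtained through simple arithmetic, which is why they have become the foundation of current representation learning.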

The TransE algorithm proceeds as follows:

Figure 5. TransE algorithm pseudocode
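As a rough illustration of the idea behind the pseudocode, the sketch below follows the original TransE formulation: a triple (h, r, t) is plausible when the embedding of h translated by r lands close to t, and training pushes corrupted triples at least a margin farther away than true ones. This is a pure-Python sketch under that assumption, not the PGL implementation; the function names and toy embedding values are hypothetical, and the PGL runs in this article use a different loss (the logs report `loss_type: Logsigmoid` with adversarial negative sampling).

```python
import random


def transe_distance(h, r, t, p=1):
    """L_p distance between the translated head (h + r) and the tail t."""
    return sum(abs(hi + ri - ti) ** p for hi, ri, ti in zip(h, r, t)) ** (1.0 / p)


def margin_ranking_loss(pos_dist, neg_dist, margin=1.0):
    """Hinge loss: zero once the corrupted triple is at least `margin` farther."""
    return max(0.0, margin + pos_dist - neg_dist)


def corrupt(triple, num_entities, rng):
    """Replace either the head or the tail (never both) with a random entity id.

    The random id may coincide with the original one; a real sampler
    would usually resample in that case.
    """
    h, r, t = triple
    replacement = rng.randrange(num_entities)
    if rng.random() < 0.5:
        return (replacement, r, t)
    return (h, r, replacement)


# Toy embeddings (hypothetical values): entity 0 + relation 0 equals entity 1.
entity_emb = {0: [1.0, 0.0], 1: [1.0, 1.0], 2: [-1.0, 0.5]}
relation_emb = {0: [0.0, 1.0]}


def dist(triple):
    h, r, t = triple
    return transe_distance(entity_emb[h], relation_emb[r], entity_emb[t])


rng = random.Random(0)
positive = (0, 0, 1)
negative = corrupt(positive, num_entities=3, rng=rng)

print("positive distance:", dist(positive))
print("negative triple:", negative, "distance:", dist(negative))
print("loss:", margin_ranking_loss(dist(positive), dist(negative)))
```

In a full training loop the loss would be minimized with gradient descent over minibatches of triples, updating the entity and relation embeddings at each step.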
In [6]
# Train a model; change model_name to switch to a different model
# Per-model parameters can be found in the sh files under /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph2KG/models
# The meaning of each parameter is documented in /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph2KG/config.py
# Enter the project directory
%cd /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph2KG/
Output:
/home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph2KG

Training Results

Meaning of the evaluation metrics:

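MRR: MRR stands for Mean Reciprocal Rank, the average of the reciprocal ranks of the correct entities. It is computed as:

$$\mathrm{MRR} = \frac{1}{|S|} \sum_{i=1}^{|S|} \frac{1}{\mathrm{rank}_i}$$

Here $S$ is the set of evaluated triples, $|S|$ is its size, and $\mathrm{rank}_i$ is the rank given to the correct entity of the $i$-th triple. Higher MRR is better.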

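MR: MR stands for Mean Rank. It is computed as:

$$\mathrm{MR} = \frac{1}{|S|} \sum_{i=1}^{|S|} \mathrm{rank}_i$$

The computation mirrors MRR, except that the raw rank is averaged instead of its reciprocal. Lower MR is better.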

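HITS@n: the fraction of evaluated triples whose correct entity is ranked at position n or better. It is computed as:

$$\mathrm{HITS@}n = \frac{1}{|S|} \sum_{i=1}^{|S|} \mathbb{I}(\mathrm{rank}_i \le n)$$

As above, $S$ is the set of evaluated triples and $\mathrm{rank}_i$ is the rank of the correct entity of the $i$-th triple. $\mathbb{I}(\cdot)$ is the indicator function (1 if the condition holds, 0 otherwise). n is usually 1, 3, or 10, and higher HITS@n is better.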
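The three metrics can be computed directly from a list of ranks. The sketch below is a minimal pure-Python illustration of the formulas, not the evaluation code used by PGL; the function names and the toy rank list are hypothetical.

```python
def mean_reciprocal_rank(ranks):
    """MRR: average of 1 / rank_i over all evaluated triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def mean_rank(ranks):
    """MR: average of rank_i over all evaluated triples."""
    return sum(ranks) / len(ranks)


def hits_at_n(ranks, n):
    """HITS@n: fraction of triples whose correct entity is ranked <= n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Hypothetical ranks of the correct entity for four evaluated triples.
ranks = [1, 2, 4, 20]

print("MRR    :", mean_reciprocal_rank(ranks))
print("MR     :", mean_rank(ranks))
for n in (1, 3, 10):
    print(f"HITS@{n}:", hits_at_n(ranks, n))
```

For this toy list the single badly ranked triple (rank 20) dominates MR but barely moves MRR, which is one reason the two are usually reported together.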

Training TransE

In [5]
# Train TransE
!python -u train.py --model_name TransE \
                    --data_name  OpenBG500 \
                    --data_path  /home/aistudio/data/ \
                    --save_path /home/aistudio/result/transe \
                    --batch_size 1000 --test_batch_size 16 \
                    --log_interval 1000 --eval_interval 24000 \
                    --reg_coef 1e-9 --reg_norm 3 \
                    --neg_sample_size 200 --neg_sample_type 'chunk' \
                    --embed_dim 400 --gamma 19.9 -adv \
                    --num_workers 8 --num_epoch 30 \
                    --print_on_screen --filter_eval --lr 0.25 \
                    --optimizer adagrad --valid
Output:
----------------------------------------
        Device Setting        
----------------------------------------
 Entity   embedding place: gpu
 Relation embedding place: gpu
----------------------------------------
----------------------------------------
       Embedding Setting      
----------------------------------------
 Entity   embedding dimension: 400
 Relation embedding dimension: 400
----------------------------------------
2024-12-02 17:59:27,276 INFO     seed                :0
2024-12-02 17:59:27,276 INFO     data_path           :/home/aistudio/data/
2024-12-02 17:59:27,276 INFO     save_path           :/home/aistudio/result/transe/transe_OpenBG500_d_400_g_19.9_e_gpu_r_gpu_l_Logsigmoid_lr_0.25_0.1_KGE
2024-12-02 17:59:27,276 INFO     init_from_ckpt      :None
2024-12-02 17:59:27,276 INFO     data_name           :OpenBG500
2024-12-02 17:59:27,276 INFO     use_dict            :False
2024-12-02 17:59:27,276 INFO     kv_mode             :False
2024-12-02 17:59:27,276 INFO     batch_size          :1000
2024-12-02 17:59:27,276 INFO     test_batch_size     :16
2024-12-02 17:59:27,277 INFO     neg_sample_size     :200
2024-12-02 17:59:27,277 INFO     filter_eval         :True
2024-12-02 17:59:27,277 INFO     model_name          :transe
2024-12-02 17:59:27,277 INFO     embed_dim           :400
2024-12-02 17:59:27,277 INFO     reg_coef            :1e-09
2024-12-02 17:59:27,277 INFO     loss_type           :Logsigmoid
2024-12-02 17:59:27,277 INFO     max_steps           :2000000
2024-12-02 17:59:27,277 INFO     lr                  :0.25
2024-12-02 17:59:27,277 INFO     optimizer           :adagrad
2024-12-02 17:59:27,277 INFO     cpu_lr              :0.1
2024-12-02 17:59:27,277 INFO     cpu_optimizer       :adagrad
2024-12-02 17:59:27,277 INFO     mix_cpu_gpu         :False
2024-12-02 17:59:27,277 INFO     async_update        :False
2024-12-02 17:59:27,277 INFO     valid               :True
2024-12-02 17:59:27,277 INFO     test                :False
2024-12-02 17:59:27,277 INFO     task_name           :KGE
2024-12-02 17:59:27,277 INFO     num_workers         :8
2024-12-02 17:59:27,277 INFO     neg_sample_type     :chunk
2024-12-02 17:59:27,277 INFO     neg_deg_sample      :False
2024-12-02 17:59:27,277 INFO     neg_adversarial_sampling:True
2024-12-02 17:59:27,277 INFO     adversarial_temperature:1.0
2024-12-02 17:59:27,277 INFO     filter_sample       :False
2024-12-02 17:59:27,277 INFO     valid_percent       :1.0
2024-12-02 17:59:27,277 INFO     use_feature         :False
2024-12-02 17:59:27,277 INFO     reg_type            :norm_er
2024-12-02 17:59:27,278 INFO     reg_norm            :3
2024-12-02 17:59:27,278 INFO     weighted_loss       :False
2024-12-02 17:59:27,278 INFO     margin              :1.0
2024-12-02 17:59:27,278 INFO     pairwise            :False
2024-12-02 17:59:27,278 INFO     gamma               :19.9
2024-12-02 17:59:27,278 INFO     ote_scale           :0
2024-12-02 17:59:27,278 INFO     ote_size            :1
2024-12-02 17:59:27,278 INFO     quate_lmbda1        :0.0
2024-12-02 17:59:27,278 INFO     quate_lmbda2        :0.0
2024-12-02 17:59:27,278 INFO     num_epoch           :30
2024-12-02 17:59:27,278 INFO     scheduler_interval  :-1
2024-12-02 17:59:27,278 INFO     num_process         :1
2024-12-02 17:59:27,278 INFO     print_on_screen     :True
2024-12-02 17:59:27,278 INFO     log_interval        :1000
2024-12-02 17:59:27,278 INFO     save_interval       :-1
2024-12-02 17:59:27,278 INFO     eval_interval       :24000
2024-12-02 17:59:27,278 INFO     ent_emb_on_cpu      :False
2024-12-02 17:59:27,278 INFO     rel_emb_on_cpu      :False
2024-12-02 17:59:27,278 INFO     use_embedding_regularization:True
2024-12-02 17:59:27,278 INFO     ent_dim             :400
2024-12-02 17:59:27,278 INFO     rel_dim             :400
2024-12-02 17:59:27,278 INFO     num_chunks          :5
W1202 17:59:42.894878   950 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1202 17:59:42.897931   950 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:3983: DeprecationWarning: Op `adagrad` is executed through `append_op` under the dynamic mode, the corresponding API implementation needs to be upgraded to using `_C_ops` method.
  DeprecationWarning,
2024-12-02 17:59:56,356 INFO     step: 999, loss: 0.92951, reg: 1.1653e-03, speed: 83.88 steps/s, time: 11.92 s
2024-12-02 17:59:56,356 INFO     sample: 1.688623, forward: 4.919557, backward: 0.766675, update: 4.536972
2024-12-02 18:00:07,572 INFO     step: 1999, loss: 0.51847, reg: 1.6318e-03, speed: 89.16 steps/s, time: 11.22 s
2024-12-02 18:00:07,572 INFO     sample: 1.472965, forward: 4.039794, backward: 0.771991, update: 4.921700
2024-12-02 18:00:18,513 INFO     step: 2999, loss: 0.38514, reg: 1.7626e-03, speed: 91.40 steps/s, time: 10.94 s
2024-12-02 18:00:18,513 INFO     sample: 1.473510, forward: 4.081738, backward: 0.748087, update: 4.628082
2024-12-02 18:00:29,354 INFO     step: 3999, loss: 0.32582, reg: 1.8105e-03, speed: 92.25 steps/s, time: 10.84 s
2024-12-02 18:00:29,354 INFO     sample: 1.541132, forward: 4.029341, backward: 0.737082, update: 4.524287
2024-12-02 18:00:40,563 INFO     step: 4999, loss: 0.29788, reg: 1.8435e-03, speed: 89.21 steps/s, time: 11.21 s
2024-12-02 18:00:40,564 INFO     sample: 1.484886, forward: 4.247042, backward: 0.905907, update: 4.561674
2024-12-02 18:00:50,743 INFO     step: 5999, loss: 0.27639, reg: 1.8786e-03, speed: 98.24 steps/s, time: 10.18 s
2024-12-02 18:00:50,743 INFO     sample: 0.547168, forward: 4.085115, backward: 0.844820, update: 4.690917
2024-12-02 18:01:01,927 INFO     step: 6999, loss: 0.26605, reg: 1.9013e-03, speed: 89.41 steps/s, time: 11.18 s
2024-12-02 18:01:01,927 INFO     sample: 1.570992, forward: 4.085207, backward: 0.755627, update: 4.762935
2024-12-02 18:01:13,348 INFO     step: 7999, loss: 0.25838, reg: 1.9202e-03, speed: 87.56 steps/s, time: 11.42 s
2024-12-02 18:01:13,349 INFO     sample: 1.683276, forward: 3.915311, backward: 0.832062, update: 4.980773
2024-12-02 18:01:24,567 INFO     step: 8999, loss: 0.25231, reg: 1.9371e-03, speed: 89.14 steps/s, time: 11.22 s
2024-12-02 18:01:24,568 INFO     sample: 1.619009, forward: 3.994691, backward: 0.792235, update: 4.803078
2024-12-02 18:01:35,494 INFO     step: 9999, loss: 0.24707, reg: 1.9526e-03, speed: 91.53 steps/s, time: 10.93 s
2024-12-02 18:01:35,494 INFO     sample: 1.559962, forward: 4.007938, backward: 0.758157, update: 4.590330
2024-12-02 18:01:45,652 INFO     step: 10999, loss: 0.23906, reg: 1.9714e-03, speed: 98.44 steps/s, time: 10.16 s
2024-12-02 18:01:45,653 INFO     sample: 0.566743, forward: 4.002622, backward: 0.798780, update: 4.779858
2024-12-02 18:01:56,419 INFO     step: 11999, loss: 0.23441, reg: 1.9833e-03, speed: 92.88 steps/s, time: 10.77 s
2024-12-02 18:01:56,419 INFO     sample: 1.455285, forward: 4.058579, backward: 0.742973, update: 4.500160
2024-12-02 18:02:07,230 INFO     step: 12999, loss: 0.23135, reg: 1.9933e-03, speed: 92.50 steps/s, time: 10.81 s
2024-12-02 18:02:07,230 INFO     sample: 1.447748, forward: 4.046167, backward: 0.805724, update: 4.501569
2024-12-02 18:02:18,255 INFO     step: 13999, loss: 0.22790, reg: 2.0011e-03, speed: 90.70 steps/s, time: 11.03 s
2024-12-02 18:02:18,256 INFO     sample: 1.513487, forward: 4.083624, backward: 0.746703, update: 4.672163
2024-12-02 18:02:29,411 INFO     step: 14999, loss: 0.22552, reg: 2.0106e-03, speed: 89.64 steps/s, time: 11.16 s
2024-12-02 18:02:29,412 INFO     sample: 1.557806, forward: 3.985478, backward: 0.776811, update: 4.825541
2024-12-02 18:02:39,290 INFO     step: 15999, loss: 0.22099, reg: 2.0221e-03, speed: 101.23 steps/s, time: 9.88 s
2024-12-02 18:02:39,291 INFO     sample: 0.551149, forward: 4.013759, backward: 0.778107, update: 4.525784
2024-12-02 18:02:50,121 INFO     step: 16999, loss: 0.21762, reg: 2.0290e-03, speed: 92.34 steps/s, time: 10.83 s
2024-12-02 18:02:50,121 INFO     sample: 1.527163, forward: 3.989723, backward: 0.741708, update: 4.561725
2024-12-02 18:03:01,113 INFO     step: 17999, loss: 0.21553, reg: 2.0352e-03, speed: 90.97 steps/s, time: 10.99 s
2024-12-02 18:03:01,113 INFO     sample: 1.498610, forward: 4.078822, backward: 0.780088, update: 4.625880
2024-12-02 18:03:12,150 INFO     step: 18999, loss: 0.21343, reg: 2.0391e-03, speed: 90.61 steps/s, time: 11.04 s
2024-12-02 18:03:12,151 INFO     sample: 1.405494, forward: 4.190880, backward: 0.737650, update: 4.694216
2024-12-02 18:03:23,375 INFO     step: 19999, loss: 0.21156, reg: 2.0436e-03, speed: 89.09 steps/s, time: 11.22 s
2024-12-02 18:03:23,375 INFO     sample: 1.573719, forward: 4.092272, backward: 0.754539, update: 4.794951
2024-12-02 18:03:33,439 INFO     step: 20999, loss: 0.20898, reg: 2.0518e-03, speed: 99.37 steps/s, time: 10.06 s
2024-12-02 18:03:33,439 INFO     sample: 0.576577, forward: 3.965927, backward: 0.787168, update: 4.723114
2024-12-02 18:03:44,508 INFO     step: 21999, loss: 0.20621, reg: 2.0568e-03, speed: 90.35 steps/s, time: 11.07 s
2024-12-02 18:03:44,508 INFO     sample: 1.651846, forward: 3.960945, backward: 0.768354, update: 4.677300
2024-12-02 18:03:55,515 INFO     step: 22999, loss: 0.20442, reg: 2.0595e-03, speed: 90.85 steps/s, time: 11.01 s
2024-12-02 18:03:55,515 INFO     sample: 1.491682, forward: 4.063816, backward: 0.841389, update: 4.600708
2024-12-02 18:04:06,237 INFO     step: 23999, loss: 0.20425, reg: 2.0644e-03, speed: 93.27 steps/s, time: 10.72 s
2024-12-02 18:04:06,237 INFO     sample: 1.510683, forward: 3.982136, backward: 0.715879, update: 4.504118
2024-12-02 18:04:06,237 INFO     [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:01<00:00,  5.13it/s]
2024-12-02 18:05:07,409 INFO     -------------- valid result --------------
2024-12-02 18:05:07,409 INFO     t,r->h  |MRR: 0.0061128707602620125 MR: 12959.571 HITS@1: 0.0006 HITS@3: 0.0024 HITS@10: 0.0114
2024-12-02 18:05:07,409 INFO     h,r->t  |MRR: 0.6111788749694824 MR: 324.1712 HITS@1: 0.4724 HITS@3: 0.6978 HITS@10: 0.8714
2024-12-02 18:05:07,410 INFO     average |MRR: 0.3086458742618561 MR: 6641.8711 HITS@1: 0.2365 HITS@3: 0.35009999999999997 HITS@10: 0.44139999999999996
2024-12-02 18:05:07,410 INFO     -----------------------------------------
2024-12-02 18:05:07,428 INFO     [evaluation] finished! It takes 61.1903 sec s
2024-12-02 18:05:18,428 INFO     step: 24999, loss: 0.20312, reg: 2.0670e-03, speed: 13.85 steps/s, time: 72.19 s
2024-12-02 18:05:18,428 INFO     sample: 1.509715, forward: 4.020733, backward: 0.737582, update: 4.722828
2024-12-02 18:05:28,551 INFO     step: 25999, loss: 0.20132, reg: 2.0727e-03, speed: 98.78 steps/s, time: 10.12 s
2024-12-02 18:05:28,552 INFO     sample: 0.561651, forward: 4.033424, backward: 0.810810, update: 4.706133
2024-12-02 18:05:39,378 INFO     step: 26999, loss: 0.19849, reg: 2.0765e-03, speed: 92.38 steps/s, time: 10.82 s
2024-12-02 18:05:39,378 INFO     sample: 1.559905, forward: 3.985834, backward: 0.729987, update: 4.540352
2024-12-02 18:05:50,471 INFO     step: 27999, loss: 0.19730, reg: 2.0781e-03, speed: 90.15 steps/s, time: 11.09 s
2024-12-02 18:05:50,471 INFO     sample: 1.559080, forward: 4.042983, backward: 0.734473, update: 4.747451
2024-12-02 18:06:01,203 INFO     step: 28999, loss: 0.19672, reg: 2.0815e-03, speed: 93.18 steps/s, time: 10.73 s
2024-12-02 18:06:01,204 INFO     sample: 1.507630, forward: 4.036844, backward: 0.734679, update: 4.444377
2024-12-02 18:06:12,115 INFO     step: 29999, loss: 0.19574, reg: 2.0834e-03, speed: 91.65 steps/s, time: 10.91 s
2024-12-02 18:06:12,115 INFO     sample: 1.571107, forward: 3.980559, backward: 0.763590, update: 4.586056
2024-12-02 18:06:21,961 INFO     step: 30999, loss: 0.19457, reg: 2.0877e-03, speed: 101.56 steps/s, time: 9.85 s
2024-12-02 18:06:21,961 INFO     sample: 0.569508, forward: 3.990863, backward: 0.762789, update: 4.513225
2024-12-02 18:06:32,969 INFO     step: 31999, loss: 0.19238, reg: 2.0913e-03, speed: 90.84 steps/s, time: 11.01 s
2024-12-02 18:06:32,970 INFO     sample: 1.488254, forward: 4.046757, backward: 0.736669, update: 4.726510
2024-12-02 18:06:43,727 INFO     step: 32999, loss: 0.19179, reg: 2.0939e-03, speed: 92.96 steps/s, time: 10.76 s
2024-12-02 18:06:43,727 INFO     sample: 1.450147, forward: 4.031254, backward: 0.713830, update: 4.553335
2024-12-02 18:06:54,476 INFO     step: 33999, loss: 0.19153, reg: 2.0957e-03, speed: 93.03 steps/s, time: 10.75 s
2024-12-02 18:06:54,477 INFO     sample: 1.525651, forward: 3.943282, backward: 0.724953, update: 4.546392
2024-12-02 18:07:05,339 INFO     step: 34999, loss: 0.19035, reg: 2.0975e-03, speed: 92.06 steps/s, time: 10.86 s
2024-12-02 18:07:05,339 INFO     sample: 1.595187, forward: 3.962639, backward: 0.729714, update: 4.565783
2024-12-02 18:07:15,297 INFO     step: 35999, loss: 0.18968, reg: 2.1017e-03, speed: 100.43 steps/s, time: 9.96 s
2024-12-02 18:07:15,297 INFO     sample: 0.551351, forward: 3.950835, backward: 0.763969, update: 4.680900
2024-12-02 18:07:26,293 INFO     step: 36999, loss: 0.18803, reg: 2.1060e-03, speed: 90.94 steps/s, time: 11.00 s
2024-12-02 18:07:26,293 INFO     sample: 1.556007, forward: 3.950179, backward: 0.737953, update: 4.742873
2024-12-02 18:07:29,057 INFO     [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:01<00:00,  5.11it/s]
2024-12-02 18:08:30,470 INFO     -------------- valid result --------------
2024-12-02 18:08:30,470 INFO     t,r->h  |MRR: 0.008901488035917282 MR: 12475.6678 HITS@1: 0.0016 HITS@3: 0.0052 HITS@10: 0.0172
2024-12-02 18:08:30,470 INFO     h,r->t  |MRR: 0.6207280158996582 MR: 281.0964 HITS@1: 0.4752 HITS@3: 0.722 HITS@10: 0.8848
2024-12-02 18:08:30,470 INFO     average |MRR: 0.3148147463798523 MR: 6378.3821 HITS@1: 0.2384 HITS@3: 0.3636 HITS@10: 0.451
2024-12-02 18:08:30,470 INFO     -----------------------------------------
2024-12-02 18:08:30,489 INFO     [evaluation] finished! It takes 61.4317 sec s
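The `average` row in the validation output appears to be the plain mean of the head-prediction row (`t,r->h`) and the tail-prediction row (`h,r->t`). The short check below recomputes it from the final TransE validation numbers copied from the log, assuming that interpretation; the function name `mean_of_directions` is hypothetical.

```python
# Final TransE validation results, copied from the training log.
head_pred = {"MRR": 0.008901488035917282, "MR": 12475.6678,
             "HITS@1": 0.0016, "HITS@3": 0.0052, "HITS@10": 0.0172}
tail_pred = {"MRR": 0.6207280158996582, "MR": 281.0964,
             "HITS@1": 0.4752, "HITS@3": 0.722, "HITS@10": 0.8848}
logged_average = {"MRR": 0.3148147463798523, "MR": 6378.3821,
                  "HITS@1": 0.2384, "HITS@3": 0.3636, "HITS@10": 0.451}


def mean_of_directions(head, tail):
    """Average each metric over the two prediction directions."""
    return {name: (head[name] + tail[name]) / 2 for name in head}


recomputed = mean_of_directions(head_pred, tail_pred)
for name, value in recomputed.items():
    print(f"{name}: recomputed={value:.4f} logged={logged_average[name]:.4f}")
```

The recomputed values match the logged ones to four decimal places. The two directions differ widely in this run: tail prediction reaches a HITS@10 of 0.8848 while head prediction stays at 0.0172, so the averaged figures mostly reflect how hard head prediction is.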

Training DistMult

In [6]
# Train DistMult
!python -u train.py --model_name DistMult \
                    --data_name  OpenBG500 \
                    --data_path  /home/aistudio/data/ \
                    --save_path /home/aistudio/result/Distmult \
                    --batch_size 1000 --test_batch_size 16 --log_interval 1000 --eval_interval 24000  --neg_sample_type 'chunk' \
                    --num_workers 2 --neg_sample_size 200 --embed_dim 400 --gamma 143.0 --lr 0.08 --optimizer adagrad \
                    -adv --num_epoch 30 --filter_eval --print_on_screen --reg_coef 2e-6 --reg_norm 3 --valid
登录后复制        
----------------------------------------        Device Setting        ---------------------------------------- Entity   embedding place: gpu Relation embedding place: gpu--------------------------------------------------------------------------------       Embedding Setting      ---------------------------------------- Entity   embedding dimension: 400 Relation embedding dimension: 400----------------------------------------2024-12-02 18:08:50,186 INFO     seed                :02024-12-02 18:08:50,186 INFO     data_path           :/home/aistudio/data/2024-12-02 18:08:50,186 INFO     save_path           :/home/aistudio/result/Distmult/distmult_OpenBG500_d_400_g_143.0_e_gpu_r_gpu_l_Logsigmoid_lr_0.08_0.1_KGE2024-12-02 18:08:50,186 INFO     init_from_ckpt      :None2024-12-02 18:08:50,186 INFO     data_name           :OpenBG5002024-12-02 18:08:50,186 INFO     use_dict            :False2024-12-02 18:08:50,186 INFO     kv_mode             :False2024-12-02 18:08:50,187 INFO     batch_size          :10002024-12-02 18:08:50,187 INFO     test_batch_size     :162024-12-02 18:08:50,187 INFO     neg_sample_size     :2002024-12-02 18:08:50,187 INFO     filter_eval         :True2024-12-02 18:08:50,187 INFO     model_name          :distmult2024-12-02 18:08:50,187 INFO     embed_dim           :4002024-12-02 18:08:50,187 INFO     reg_coef            :2e-062024-12-02 18:08:50,187 INFO     loss_type           :Logsigmoid2024-12-02 18:08:50,187 INFO     max_steps           :20000002024-12-02 18:08:50,187 INFO     lr                  :0.082024-12-02 18:08:50,187 INFO     optimizer           :adagrad2024-12-02 18:08:50,187 INFO     cpu_lr              :0.12024-12-02 18:08:50,187 INFO     cpu_optimizer       :adagrad2024-12-02 18:08:50,187 INFO     mix_cpu_gpu         :False2024-12-02 18:08:50,187 INFO     async_update        :False2024-12-02 18:08:50,187 INFO     valid               :True2024-12-02 18:08:50,187 INFO     test                :False2024-12-02 18:08:50,187 INFO     
task_name           :KGE2024-12-02 18:08:50,187 INFO     num_workers         :22024-12-02 18:08:50,187 INFO     neg_sample_type     :chunk2024-12-02 18:08:50,187 INFO     neg_deg_sample      :False2024-12-02 18:08:50,187 INFO     neg_adversarial_sampling:True2024-12-02 18:08:50,187 INFO     adversarial_temperature:1.02024-12-02 18:08:50,187 INFO     filter_sample       :False2024-12-02 18:08:50,187 INFO     valid_percent       :1.02024-12-02 18:08:50,188 INFO     use_feature         :False2024-12-02 18:08:50,188 INFO     reg_type            :norm_er2024-12-02 18:08:50,188 INFO     reg_norm            :32024-12-02 18:08:50,188 INFO     weighted_loss       :False2024-12-02 18:08:50,188 INFO     margin              :1.02024-12-02 18:08:50,188 INFO     pairwise            :False2024-12-02 18:08:50,188 INFO     gamma               :143.02024-12-02 18:08:50,188 INFO     ote_scale           :02024-12-02 18:08:50,188 INFO     ote_size            :12024-12-02 18:08:50,188 INFO     quate_lmbda1        :0.02024-12-02 18:08:50,188 INFO     quate_lmbda2        :0.02024-12-02 18:08:50,188 INFO     num_epoch           :302024-12-02 18:08:50,188 INFO     scheduler_interval  :-12024-12-02 18:08:50,188 INFO     num_process         :12024-12-02 18:08:50,188 INFO     print_on_screen     :True2024-12-02 18:08:50,188 INFO     log_interval        :10002024-12-02 18:08:50,188 INFO     save_interval       :-12024-12-02 18:08:50,188 INFO     eval_interval       :240002024-12-02 18:08:50,188 INFO     ent_emb_on_cpu      :False2024-12-02 18:08:50,188 INFO     rel_emb_on_cpu      :False2024-12-02 18:08:50,188 INFO     use_embedding_regularization:True2024-12-02 18:08:50,188 INFO     ent_dim             :4002024-12-02 18:08:50,188 INFO     rel_dim             :4002024-12-02 18:08:50,188 INFO     num_chunks          :5W1202 18:09:05.583375  4119 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2W1202 18:09:05.586427  4119 
gpu_resources.cc:91] device: 0, cuDNN Version: 8.2./opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:3983: DeprecationWarning: Op `adagrad` is executed through `append_op` under the dynamic mode, the corresponding API implementation needs to be upgraded to using `_C_ops` method.  DeprecationWarning,2024-12-02 18:09:18,491 INFO     step: 999, loss: 0.65767, reg: 3.0928e-02, speed: 88.19 steps/s, time: 11.34 s2024-12-02 18:09:18,491 INFO     sample: 1.415885, forward: 4.589931, backward: 0.587078, update: 4.7364942024-12-02 18:09:28,727 INFO     step: 1999, loss: 0.47731, reg: 5.1126e-02, speed: 97.69 steps/s, time: 10.24 s2024-12-02 18:09:28,727 INFO     sample: 1.130671, forward: 3.708618, backward: 0.588126, update: 4.7989842024-12-02 18:09:38,812 INFO     step: 2999, loss: 0.36043, reg: 7.8275e-02, speed: 99.18 steps/s, time: 10.08 s2024-12-02 18:09:38,812 INFO     sample: 1.150921, forward: 3.723767, backward: 0.571187, update: 4.6285712024-12-02 18:09:48,997 INFO     step: 3999, loss: 0.26451, reg: 1.0150e-01, speed: 98.19 steps/s, time: 10.18 s2024-12-02 18:09:48,997 INFO     sample: 1.137154, forward: 3.755923, backward: 0.576677, update: 4.7061782024-12-02 18:09:59,035 INFO     step: 4999, loss: 0.22269, reg: 1.0864e-01, speed: 99.62 steps/s, time: 10.04 s2024-12-02 18:09:59,036 INFO     sample: 1.155761, forward: 3.717498, backward: 0.594281, update: 4.5608832024-12-02 18:10:08,577 INFO     step: 5999, loss: 0.19857, reg: 1.1085e-01, speed: 104.81 steps/s, time: 9.54 s2024-12-02 18:10:08,578 INFO     sample: 0.539480, forward: 3.752833, backward: 0.611382, update: 4.6272522024-12-02 18:10:18,732 INFO     step: 6999, loss: 0.18926, reg: 1.1167e-01, speed: 98.48 steps/s, time: 10.15 s2024-12-02 18:10:18,732 INFO     sample: 1.147535, forward: 3.698858, backward: 0.579680, update: 4.7189012024-12-02 18:10:29,036 INFO     step: 7999, loss: 0.18307, reg: 1.1203e-01, speed: 97.06 steps/s, time: 10.30 s2024-12-02 
18:10:29,036 INFO     sample: 1.256027, forward: 3.665453, backward: 0.609613, update: 4.761558
2024-12-02 18:10:39,387 INFO     step: 8999, loss: 0.17884, reg: 1.1225e-01, speed: 96.61 steps/s, time: 10.35 s
2024-12-02 18:10:39,387 INFO     sample: 1.297724, forward: 3.694467, backward: 0.585683, update: 4.763523
2024-12-02 18:10:49,545 INFO     step: 9999, loss: 0.17628, reg: 1.1231e-01, speed: 98.45 steps/s, time: 10.16 s
2024-12-02 18:10:49,545 INFO     sample: 1.148723, forward: 3.705680, backward: 0.581249, update: 4.712343
2024-12-02 18:10:59,257 INFO     step: 10999, loss: 0.17163, reg: 1.1230e-01, speed: 102.97 steps/s, time: 9.71 s
2024-12-02 18:10:59,257 INFO     sample: 0.656959, forward: 3.726715, backward: 0.609718, update: 4.707800
2024-12-02 18:11:09,316 INFO     step: 11999, loss: 0.16926, reg: 1.1216e-01, speed: 99.41 steps/s, time: 10.06 s
2024-12-02 18:11:09,316 INFO     sample: 1.168218, forward: 3.704518, backward: 0.582222, update: 4.594822
2024-12-02 18:11:19,360 INFO     step: 12999, loss: 0.16832, reg: 1.1201e-01, speed: 99.57 steps/s, time: 10.04 s
2024-12-02 18:11:19,360 INFO     sample: 1.115536, forward: 3.718422, backward: 0.570747, update: 4.629476
2024-12-02 18:11:29,518 INFO     step: 13999, loss: 0.16686, reg: 1.1190e-01, speed: 98.45 steps/s, time: 10.16 s
2024-12-02 18:11:29,518 INFO     sample: 1.128502, forward: 3.734195, backward: 0.591375, update: 4.694244
2024-12-02 18:11:39,692 INFO     step: 14999, loss: 0.16598, reg: 1.1179e-01, speed: 98.29 steps/s, time: 10.17 s
2024-12-02 18:11:39,692 INFO     sample: 1.130593, forward: 3.707155, backward: 0.588573, update: 4.738001
2024-12-02 18:11:48,977 INFO     step: 15999, loss: 0.16416, reg: 1.1161e-01, speed: 107.71 steps/s, time: 9.28 s
2024-12-02 18:11:48,977 INFO     sample: 0.525574, forward: 3.727848, backward: 0.585805, update: 4.435622
2024-12-02 18:11:59,296 INFO     step: 16999, loss: 0.16211, reg: 1.1149e-01, speed: 96.91 steps/s, time: 10.32 s
2024-12-02 18:11:59,296 INFO     sample: 1.236274, forward: 3.714505, backward: 0.597196, update: 4.760456
2024-12-02 18:12:09,569 INFO     step: 17999, loss: 0.16160, reg: 1.1128e-01, speed: 97.35 steps/s, time: 10.27 s
2024-12-02 18:12:09,569 INFO     sample: 1.280665, forward: 3.682961, backward: 0.573797, update: 4.726157
2024-12-02 18:12:19,900 INFO     step: 18999, loss: 0.16120, reg: 1.1113e-01, speed: 96.80 steps/s, time: 10.33 s
2024-12-02 18:12:19,900 INFO     sample: 1.334101, forward: 3.705495, backward: 0.583138, update: 4.698704
2024-12-02 18:12:30,103 INFO     step: 19999, loss: 0.16087, reg: 1.1097e-01, speed: 98.01 steps/s, time: 10.20 s
2024-12-02 18:12:30,104 INFO     sample: 1.148571, forward: 3.696767, backward: 0.588128, update: 4.760586
2024-12-02 18:12:40,133 INFO     step: 20999, loss: 0.15984, reg: 1.1080e-01, speed: 99.71 steps/s, time: 10.03 s
2024-12-02 18:12:40,133 INFO     sample: 0.571947, forward: 3.864387, backward: 0.747324, update: 4.835415
2024-12-02 18:12:50,892 INFO     step: 21999, loss: 0.15812, reg: 1.1065e-01, speed: 92.95 steps/s, time: 10.76 s
2024-12-02 18:12:50,892 INFO     sample: 1.227075, forward: 3.768062, backward: 0.633664, update: 5.117799
2024-12-02 18:13:01,230 INFO     step: 22999, loss: 0.15765, reg: 1.1051e-01, speed: 96.74 steps/s, time: 10.34 s
2024-12-02 18:13:01,230 INFO     sample: 1.141957, forward: 3.835448, backward: 0.589297, update: 4.761678
2024-12-02 18:13:11,617 INFO     step: 23999, loss: 0.15778, reg: 1.1034e-01, speed: 96.27 steps/s, time: 10.39 s
2024-12-02 18:13:11,618 INFO     sample: 1.147387, forward: 3.803446, backward: 0.597208, update: 4.829601
2024-12-02 18:13:11,618 INFO     [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [00:48<00:00,  6.45it/s]
2024-12-02 18:14:00,294 INFO     -------------- valid result --------------
2024-12-02 18:14:00,294 INFO     t,r->h  |MRR: 0.02051205188035965 MR: 14726.6136 HITS@1: 0.0072 HITS@3: 0.0168 HITS@10: 0.0398
2024-12-02 18:14:00,294 INFO     h,r->t  |MRR: 0.37609413266181946 MR: 2597.6142 HITS@1: 0.2858 HITS@3: 0.4226 HITS@10: 0.5382
2024-12-02 18:14:00,294 INFO     average |MRR: 0.19830308854579926 MR: 8662.1139 HITS@1: 0.1465 HITS@3: 0.21969999999999998 HITS@10: 0.28900000000000003
2024-12-02 18:14:00,294 INFO     -----------------------------------------
2024-12-02 18:14:00,312 INFO     [evaluation] finished! It takes 48.6946 sec s
2024-12-02 18:14:10,745 INFO     step: 24999, loss: 0.15764, reg: 1.1025e-01, speed: 16.91 steps/s, time: 59.13 s
2024-12-02 18:14:10,745 INFO     sample: 1.156561, forward: 3.777792, backward: 0.597010, update: 4.891100
2024-12-02 18:14:20,353 INFO     step: 25999, loss: 0.15691, reg: 1.1009e-01, speed: 104.08 steps/s, time: 9.61 s
2024-12-02 18:14:20,353 INFO     sample: 0.534036, forward: 3.759421, backward: 0.578114, update: 4.727532
2024-12-02 18:14:30,873 INFO     step: 26999, loss: 0.15545, reg: 1.0995e-01, speed: 95.05 steps/s, time: 10.52 s
2024-12-02 18:14:30,874 INFO     sample: 1.198932, forward: 3.849348, backward: 0.617739, update: 4.844565
2024-12-02 18:14:41,429 INFO     step: 27999, loss: 0.15578, reg: 1.0980e-01, speed: 94.74 steps/s, time: 10.56 s
2024-12-02 18:14:41,430 INFO     sample: 1.198625, forward: 3.796988, backward: 0.616503, update: 4.931934
2024-12-02 18:14:51,571 INFO     step: 28999, loss: 0.15543, reg: 1.0968e-01, speed: 98.61 steps/s, time: 10.14 s
2024-12-02 18:14:51,571 INFO     sample: 1.192123, forward: 3.727292, backward: 0.587189, update: 4.624893
2024-12-02 18:15:01,772 INFO     step: 29999, loss: 0.15530, reg: 1.0959e-01, speed: 98.03 steps/s, time: 10.20 s
2024-12-02 18:15:01,772 INFO     sample: 1.146616, forward: 3.724371, backward: 0.591486, update: 4.728866
2024-12-02 18:15:11,202 INFO     step: 30999, loss: 0.15507, reg: 1.0945e-01, speed: 106.04 steps/s, time: 9.43 s
2024-12-02 18:15:11,203 INFO     sample: 0.540366, forward: 3.709316, backward: 0.595906, update: 4.574884
2024-12-02 18:15:21,339 INFO     step: 31999, loss: 0.15347, reg: 1.0933e-01, speed: 98.67 steps/s, time: 10.14 s
2024-12-02 18:15:21,339 INFO     sample: 1.132487, forward: 3.742264, backward: 0.583816, update: 4.667446
2024-12-02 18:15:31,489 INFO     step: 32999, loss: 0.15357, reg: 1.0922e-01, speed: 98.53 steps/s, time: 10.15 s
2024-12-02 18:15:31,489 INFO     sample: 1.173653, forward: 3.757434, backward: 0.608012, update: 4.599703
2024-12-02 18:15:41,934 INFO     step: 33999, loss: 0.15379, reg: 1.0911e-01, speed: 95.74 steps/s, time: 10.44 s
2024-12-02 18:15:41,935 INFO     sample: 1.167904, forward: 3.821983, backward: 0.691112, update: 4.753360
2024-12-02 18:15:51,966 INFO     step: 34999, loss: 0.15405, reg: 1.0901e-01, speed: 99.68 steps/s, time: 10.03 s
2024-12-02 18:15:51,967 INFO     sample: 1.086866, forward: 3.739779, backward: 0.579035, update: 4.616517
2024-12-02 18:16:01,354 INFO     step: 35999, loss: 0.15383, reg: 1.0892e-01, speed: 106.53 steps/s, time: 9.39 s
2024-12-02 18:16:01,354 INFO     sample: 0.516522, forward: 3.756625, backward: 0.573577, update: 4.531823
2024-12-02 18:16:11,496 INFO     step: 36999, loss: 0.15245, reg: 1.0879e-01, speed: 98.60 steps/s, time: 10.14 s
2024-12-02 18:16:11,497 INFO     sample: 1.118747, forward: 3.701485, backward: 0.576978, update: 4.736408
2024-12-02 18:16:14,089 INFO     [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [00:47<00:00,  6.55it/s]
2024-12-02 18:17:02,014 INFO     -------------- valid result --------------
2024-12-02 18:17:02,014 INFO     t,r->h  |MRR: 0.021342920139431953 MR: 14117.3868 HITS@1: 0.0074 HITS@3: 0.0158 HITS@10: 0.0476
2024-12-02 18:17:02,014 INFO     h,r->t  |MRR: 0.40351319313049316 MR: 2034.8442 HITS@1: 0.3066 HITS@3: 0.453 HITS@10: 0.584
2024-12-02 18:17:02,014 INFO     average |MRR: 0.21242806315422058 MR: 8076.1155 HITS@1: 0.157 HITS@3: 0.2344 HITS@10: 0.31579999999999997
2024-12-02 18:17:02,014 INFO     -----------------------------------------
2024-12-02 18:17:02,033 INFO     [evaluation] finished! It takes 47.9434 sec s
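The `valid result` rows report MRR, MR and HITS@k for head prediction (`t,r->h`), tail prediction (`h,r->t`) and their average. The following is a minimal sketch of how such numbers are commonly computed from per-query ranks. It is not PGL's evaluation code: the function names `filtered_rank` and `ranking_metrics` are hypothetical, and it assumes that `--filter_eval` corresponds to the usual "filtered" setting, in which other known correct answers are excluded before ranking.

```python
def filtered_rank(scores, true_idx, known_true=()):
    """Rank (1 = best) of the true entity among all candidate entities.

    scores:     one score per candidate entity, higher is better.
    true_idx:   index of the correct entity for this query.
    known_true: indices of other entities known to be correct answers;
                they are skipped so they cannot push the true one down.
    """
    true_score = scores[true_idx]
    masked = set(known_true) - {true_idx}
    # Count unmasked candidates that score strictly higher than the true entity.
    better = sum(
        1 for idx, s in enumerate(scores)
        if idx not in masked and s > true_score
    )
    return better + 1


def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Aggregate a list of ranks into MRR, MR and HITS@k."""
    n = len(ranks)
    metrics = {
        "MRR": sum(1.0 / r for r in ranks) / n,  # mean reciprocal rank
        "MR": sum(ranks) / n,                    # mean rank
    }
    for k in ks:
        metrics[f"HITS@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


# Toy query: entity 2 is correct, entity 0 is another known answer.
scores = [0.9, 0.1, 0.7, 0.8]
print(filtered_rank(scores, true_idx=2))                  # raw rank
print(filtered_rank(scores, true_idx=2, known_true=[0]))  # filtered rank
print(ranking_metrics([1, 2, 10, 100]))
```

In the logs, the `average` row is the mean of the two direction rows; for the last evaluation, the MR values give (14117.3868 + 2034.8442) / 2 = 8076.1155.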

Training ComplEx

In [8]
# ComplEx
!python -u train.py --model_name ComplEx \
                    --data_name OpenBG500 \
                    --data_path /home/aistudio/data/ \
                    --save_path /home/aistudio/result/Complex \
                    --batch_size 1000 --log_interval 1000 --test_batch_size 16 --neg_sample_type 'chunk' --num_workers 2 \
                    --neg_sample_size 200 --embed_dim 400 --gamma 143.0 --lr 0.1 --optimizer adagrad --reg_coef 2e-6 \
                    --valid -adv --num_epoch 30 --filter_eval --print_on_screen
----------------------------------------
        Device Setting
----------------------------------------
 Entity   embedding place: gpu
 Relation embedding place: gpu
----------------------------------------
----------------------------------------
       Embedding Setting
----------------------------------------
 Entity   embedding dimension: 800
 Relation embedding dimension: 800
----------------------------------------
2024-12-02 18:19:13,502 INFO     seed                :0
2024-12-02 18:19:13,502 INFO     data_path           :/home/aistudio/data/
2024-12-02 18:19:13,502 INFO     save_path           :/home/aistudio/result/Complex/complex_OpenBG500_d_400_g_143.0_e_gpu_r_gpu_l_Logsigmoid_lr_0.1_0.1_KGE
2024-12-02 18:19:13,502 INFO     init_from_ckpt      :None
2024-12-02 18:19:13,502 INFO     data_name           :OpenBG500
2024-12-02 18:19:13,502 INFO     use_dict            :False
2024-12-02 18:19:13,502 INFO     kv_mode             :False
2024-12-02 18:19:13,503 INFO     batch_size          :1000
2024-12-02 18:19:13,503 INFO     test_batch_size     :16
2024-12-02 18:19:13,503 INFO     neg_sample_size     :200
2024-12-02 18:19:13,503 INFO     filter_eval         :True
2024-12-02 18:19:13,503 INFO     model_name          :complex
2024-12-02 18:19:13,503 INFO     embed_dim           :400
2024-12-02 18:19:13,503 INFO     reg_coef            :2e-06
2024-12-02 18:19:13,503 INFO     loss_type           :Logsigmoid
2024-12-02 18:19:13,503 INFO     max_steps           :2000000
2024-12-02 18:19:13,503 INFO     lr                  :0.1
2024-12-02 18:19:13,503 INFO     optimizer           :adagrad
2024-12-02 18:19:13,503 INFO     cpu_lr              :0.1
2024-12-02 18:19:13,503 INFO     cpu_optimizer       :adagrad
2024-12-02 18:19:13,503 INFO     mix_cpu_gpu         :False
2024-12-02 18:19:13,503 INFO     async_update        :False
2024-12-02 18:19:13,503 INFO     valid               :True
2024-12-02 18:19:13,503 INFO     test                :False
2024-12-02 18:19:13,503 INFO     task_name           :KGE
2024-12-02 18:19:13,503 INFO     num_workers         :2
2024-12-02 18:19:13,503 INFO     neg_sample_type     :chunk
2024-12-02 18:19:13,503 INFO     neg_deg_sample      :False
2024-12-02 18:19:13,503 INFO     neg_adversarial_sampling:True
2024-12-02 18:19:13,503 INFO     adversarial_temperature:1.0
2024-12-02 18:19:13,503 INFO     filter_sample       :False
2024-12-02 18:19:13,504 INFO     valid_percent       :1.0
2024-12-02 18:19:13,504 INFO     use_feature         :False
2024-12-02 18:19:13,504 INFO     reg_type            :norm_er
2024-12-02 18:19:13,504 INFO     reg_norm            :3
2024-12-02 18:19:13,504 INFO     weighted_loss       :False
2024-12-02 18:19:13,504 INFO     margin              :1.0
2024-12-02 18:19:13,504 INFO     pairwise            :False
2024-12-02 18:19:13,504 INFO     gamma               :143.0
2024-12-02 18:19:13,504 INFO     ote_scale           :0
2024-12-02 18:19:13,504 INFO     ote_size            :1
2024-12-02 18:19:13,504 INFO     quate_lmbda1        :0.0
2024-12-02 18:19:13,504 INFO     quate_lmbda2        :0.0
2024-12-02 18:19:13,504 INFO     num_epoch           :30
2024-12-02 18:19:13,504 INFO     scheduler_interval  :-1
2024-12-02 18:19:13,504 INFO     num_process         :1
2024-12-02 18:19:13,504 INFO     print_on_screen     :True
2024-12-02 18:19:13,504 INFO     log_interval        :1000
2024-12-02 18:19:13,504 INFO     save_interval       :-1
2024-12-02 18:19:13,504 INFO     eval_interval       :50000
2024-12-02 18:19:13,504 INFO     ent_emb_on_cpu      :False
2024-12-02 18:19:13,504 INFO     rel_emb_on_cpu      :False
2024-12-02 18:19:13,504 INFO     use_embedding_regularization:True
2024-12-02 18:19:13,504 INFO     ent_dim             :800
2024-12-02 18:19:13,504 INFO     rel_dim             :800
2024-12-02 18:19:13,505 INFO     num_chunks          :5
W1202 18:19:30.590066  6301 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1202 18:19:30.593201  6301 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:3983: DeprecationWarning: Op `adagrad` is executed through `append_op` under the dynamic mode, the corresponding API implementation needs to be upgraded to using `_C_ops` method.
  DeprecationWarning,
2024-12-02 18:19:49,241 INFO     step: 999, loss: 0.60153, reg: 4.4154e-02, speed: 58.71 steps/s, time: 17.03 s
2024-12-02 18:19:49,241 INFO     sample: 1.430161, forward: 10.054655, backward: 0.936485, update: 4.602625
2024-12-02 18:20:05,297 INFO     step: 1999, loss: 0.37613, reg: 7.2329e-02, speed: 62.28 steps/s, time: 16.06 s
2024-12-02 18:20:05,298 INFO     sample: 1.160390, forward: 9.158856, backward: 0.912364, update: 4.814298
2024-12-02 18:20:21,512 INFO     step: 2999, loss: 0.24694, reg: 9.4915e-02, speed: 61.67 steps/s, time: 16.21 s
2024-12-02 18:20:21,512 INFO     sample: 1.209912, forward: 9.202024, backward: 1.047861, update: 4.743357
2024-12-02 18:20:37,306 INFO     step: 3999, loss: 0.21125, reg: 9.8401e-02, speed: 63.32 steps/s, time: 15.79 s
2024-12-02 18:20:37,306 INFO     sample: 1.128468, forward: 9.161491, backward: 0.928098, update: 4.564974
2024-12-02 18:20:52,975 INFO     step: 4999, loss: 0.19485, reg: 9.8803e-02, speed: 63.82 steps/s, time: 15.67 s
2024-12-02 18:20:52,976 INFO     sample: 1.157603, forward: 9.174937, backward: 0.910179, update: 4.416690
2024-12-02 18:21:08,161 INFO     step: 5999, loss: 0.18083, reg: 9.8726e-02, speed: 65.85 steps/s, time: 15.19 s
2024-12-02 18:21:08,161 INFO     sample: 0.545803, forward: 9.165393, backward: 0.943437, update: 4.519530
2024-12-02 18:21:23,969 INFO     step: 6999, loss: 0.17592, reg: 9.8283e-02, speed: 63.26 steps/s, time: 15.81 s
2024-12-02 18:21:23,969 INFO     sample: 1.130007, forward: 9.167275, backward: 0.914725, update: 4.585609
2024-12-02 18:21:39,742 INFO     step: 7999, loss: 0.17271, reg: 9.7748e-02, speed: 63.41 steps/s, time: 15.77 s
2024-12-02 18:21:39,742 INFO     sample: 1.149211, forward: 9.181113, backward: 0.915362, update: 4.515898
2024-12-02 18:21:55,700 INFO     step: 8999, loss: 0.16979, reg: 9.7300e-02, speed: 62.66 steps/s, time: 15.96 s
2024-12-02 18:21:55,701 INFO     sample: 1.251904, forward: 9.145716, backward: 0.920547, update: 4.629378
2024-12-02 18:22:11,576 INFO     step: 9999, loss: 0.16876, reg: 9.6864e-02, speed: 62.99 steps/s, time: 15.87 s
2024-12-02 18:22:11,576 INFO     sample: 1.144423, forward: 9.155142, backward: 0.940488, update: 4.623880
2024-12-02 18:22:26,903 INFO     step: 10999, loss: 0.16426, reg: 9.6493e-02, speed: 65.24 steps/s, time: 15.33 s
2024-12-02 18:22:26,903 INFO     sample: 0.700834, forward: 9.158820, backward: 0.920809, update: 4.536675
2024-12-02 18:22:42,639 INFO     step: 11999, loss: 0.16180, reg: 9.6037e-02, speed: 63.55 steps/s, time: 15.73 s
2024-12-02 18:22:42,639 INFO     sample: 1.182562, forward: 9.178500, backward: 0.911814, update: 4.452207
2024-12-02 18:22:58,673 INFO     step: 12999, loss: 0.16086, reg: 9.5627e-02, speed: 62.37 steps/s, time: 16.03 s
2024-12-02 18:22:58,673 INFO     sample: 1.191781, forward: 9.131900, backward: 0.937113, update: 4.761683
2024-12-02 18:23:14,506 INFO     step: 13999, loss: 0.16056, reg: 9.5248e-02, speed: 63.16 steps/s, time: 15.83 s
2024-12-02 18:23:14,506 INFO     sample: 1.133638, forward: 9.148603, backward: 0.923684, update: 4.616351
2024-12-02 18:23:30,298 INFO     step: 14999, loss: 0.16007, reg: 9.4936e-02, speed: 63.33 steps/s, time: 15.79 s
2024-12-02 18:23:30,298 INFO     sample: 1.127198, forward: 9.169777, backward: 0.923043, update: 4.561079
2024-12-02 18:23:45,355 INFO     step: 15999, loss: 0.15787, reg: 9.4598e-02, speed: 66.41 steps/s, time: 15.06 s
2024-12-02 18:23:45,356 INFO     sample: 0.539107, forward: 9.188465, backward: 0.938140, update: 4.380424
2024-12-02 18:24:01,272 INFO     step: 16999, loss: 0.15570, reg: 9.4335e-02, speed: 62.83 steps/s, time: 15.92 s
2024-12-02 18:24:01,273 INFO     sample: 1.231464, forward: 9.158331, backward: 0.924203, update: 4.592128
2024-12-02 18:24:17,703 INFO     step: 17999, loss: 0.15567, reg: 9.3985e-02, speed: 60.86 steps/s, time: 16.43 s
2024-12-02 18:24:17,704 INFO     sample: 1.202493, forward: 9.243131, backward: 1.140142, update: 4.834252
2024-12-02 18:24:33,842 INFO     step: 18999, loss: 0.15539, reg: 9.3735e-02, speed: 61.97 steps/s, time: 16.14 s
2024-12-02 18:24:33,842 INFO     sample: 1.403193, forward: 9.119840, backward: 0.937885, update: 4.666509
2024-12-02 18:24:49,843 INFO     step: 19999, loss: 0.15579, reg: 9.3491e-02, speed: 62.50 steps/s, time: 16.00 s
2024-12-02 18:24:49,843 INFO     sample: 1.191690, forward: 9.132360, backward: 0.938562, update: 4.726286
2024-12-02 18:25:04,912 INFO     step: 20999, loss: 0.15388, reg: 9.3330e-02, speed: 66.36 steps/s, time: 15.07 s
2024-12-02 18:25:04,912 INFO     sample: 0.530497, forward: 9.179391, backward: 0.917852, update: 4.431005
2024-12-02 18:25:20,711 INFO     step: 21999, loss: 0.15223, reg: 9.3063e-02, speed: 63.30 steps/s, time: 15.80 s
2024-12-02 18:25:20,711 INFO     sample: 1.115061, forward: 9.183473, backward: 0.901767, update: 4.589142
2024-12-02 18:25:36,459 INFO     step: 22999, loss: 0.15214, reg: 9.2812e-02, speed: 63.50 steps/s, time: 15.75 s
2024-12-02 18:25:36,459 INFO     sample: 1.134653, forward: 9.177601, backward: 0.925939, update: 4.499598
2024-12-02 18:25:52,546 INFO     step: 23999, loss: 0.15249, reg: 9.2637e-02, speed: 62.16 steps/s, time: 16.09 s
2024-12-02 18:25:52,546 INFO     sample: 1.135302, forward: 9.202430, backward: 0.945634, update: 4.793704
2024-12-02 18:26:08,196 INFO     step: 24999, loss: 0.15223, reg: 9.2440e-02, speed: 63.90 steps/s, time: 15.65 s
2024-12-02 18:26:08,196 INFO     sample: 1.055773, forward: 9.212041, backward: 0.885661, update: 4.487227
2024-12-02 18:26:23,154 INFO     step: 25999, loss: 0.15170, reg: 9.2288e-02, speed: 66.86 steps/s, time: 14.96 s
2024-12-02 18:26:23,155 INFO     sample: 0.503797, forward: 9.239100, backward: 0.877107, update: 4.329204
2024-12-02 18:26:38,797 INFO     step: 26999, loss: 0.15009, reg: 9.2081e-02, speed: 63.93 steps/s, time: 15.64 s
2024-12-02 18:26:38,797 INFO     sample: 1.093831, forward: 9.204090, backward: 0.905856, update: 4.429241
2024-12-02 18:26:54,377 INFO     step: 27999, loss: 0.14990, reg: 9.1920e-02, speed: 64.19 steps/s, time: 15.58 s
2024-12-02 18:26:54,377 INFO     sample: 1.045313, forward: 9.235479, backward: 0.879603, update: 4.411355
2024-12-02 18:27:09,898 INFO     step: 28999, loss: 0.15004, reg: 9.1758e-02, speed: 64.43 steps/s, time: 15.52 s
2024-12-02 18:27:09,898 INFO     sample: 1.020164, forward: 9.246398, backward: 0.882828, update: 4.363002
2024-12-02 18:27:25,464 INFO     step: 29999, loss: 0.15046, reg: 9.1633e-02, speed: 64.24 steps/s, time: 15.57 s
2024-12-02 18:27:25,464 INFO     sample: 1.031425, forward: 9.239507, backward: 0.881955, update: 4.404327
2024-12-02 18:27:40,455 INFO     step: 30999, loss: 0.14995, reg: 9.1487e-02, speed: 66.71 steps/s, time: 14.99 s
2024-12-02 18:27:40,455 INFO     sample: 0.510480, forward: 9.214767, backward: 0.893431, update: 4.363040
2024-12-02 18:27:56,214 INFO     step: 31999, loss: 0.14854, reg: 9.1298e-02, speed: 63.46 steps/s, time: 15.76 s
2024-12-02 18:27:56,214 INFO     sample: 1.067064, forward: 9.228900, backward: 0.917439, update: 4.536116
2024-12-02 18:28:11,800 INFO     step: 32999, loss: 0.14825, reg: 9.1183e-02, speed: 64.16 steps/s, time: 15.59 s
2024-12-02 18:28:11,800 INFO     sample: 1.041273, forward: 9.236031, backward: 0.882269, update: 4.417620
2024-12-02 18:28:27,995 INFO     step: 33999, loss: 0.14832, reg: 9.1084e-02, speed: 61.75 steps/s, time: 16.19 s
2024-12-02 18:28:27,995 INFO     sample: 1.062898, forward: 9.357672, backward: 1.267072, update: 4.496253
2024-12-02 18:28:44,399 INFO     step: 34999, loss: 0.14902, reg: 9.0942e-02, speed: 60.96 steps/s, time: 16.40 s
2024-12-02 18:28:44,400 INFO     sample: 1.095686, forward: 9.385500, backward: 1.307641, update: 4.605081
2024-12-02 18:29:00,103 INFO     step: 35999, loss: 0.14900, reg: 9.0824e-02, speed: 63.68 steps/s, time: 15.70 s
2024-12-02 18:29:00,103 INFO     sample: 0.518057, forward: 9.401174, backward: 1.304467, update: 4.469423
2024-12-02 18:29:16,446 INFO     step: 36999, loss: 0.14712, reg: 9.0693e-02, speed: 61.19 steps/s, time: 16.34 s
2024-12-02 18:29:16,446 INFO     sample: 1.114131, forward: 9.371427, backward: 1.286469, update: 4.561806
2024-12-02 18:29:20,440 INFO     [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:30<00:00,  3.44it/s]
2024-12-02 18:30:51,460 INFO     -------------- valid result --------------
2024-12-02 18:30:51,460 INFO     t,r->h  |MRR: 0.02636275626718998 MR: 12106.768 HITS@1: 0.011 HITS@3: 0.0218 HITS@10: 0.0512
2024-12-02 18:30:51,460 INFO     h,r->t  |MRR: 0.39752769470214844 MR: 1558.398 HITS@1: 0.2928 HITS@3: 0.4508 HITS@10: 0.6012
2024-12-02 18:30:51,461 INFO     average |MRR: 0.21194522082805634 MR: 6832.583 HITS@1: 0.1519 HITS@3: 0.23629999999999998 HITS@10: 0.3262
2024-12-02 18:30:51,461 INFO     -----------------------------------------
2024-12-02 18:30:51,476 INFO     [evaluation] finished! It takes 91.0356 sec s
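The log reports an entity dimension of 800 for `--embed_dim 400`, which is consistent with each complex-valued embedding being stored as a real part plus an imaginary part. As a rough sketch of the ComplEx scoring function from Trouillon et al., Re(Σ h·r·conj(t)), here is a small pure-Python version. The helper names `to_complex` and `complex_score` are hypothetical, and the "first half real, second half imaginary" layout is an assumption for illustration; PGL's internal layout may differ.

```python
def to_complex(vec):
    """Split a real vector of length 2*d into d complex numbers.

    Assumed layout (illustrative only): the first d values are the real
    parts and the last d values are the imaginary parts.
    """
    d = len(vec) // 2
    return [complex(vec[i], vec[d + i]) for i in range(d)]


def complex_score(h, r, t):
    """ComplEx score: real part of sum_i h_i * r_i * conj(t_i)."""
    return sum((hi * ri * ti.conjugate()).real for hi, ri, ti in zip(h, r, t))


# One complex dimension (d = 1), stored as two real numbers each.
h = to_complex([1.0, 0.0])  # 1 + 0j
r = to_complex([0.0, 1.0])  # 0 + 1j
t = to_complex([0.0, 1.0])  # 0 + 1j

print(complex_score(h, r, t))  # score of (h, r, t)
print(complex_score(t, r, h))  # score of the reversed triple (t, r, h)
```

Because the tail is conjugated, swapping head and tail can change the score, which is what lets ComplEx model asymmetric relations, unlike DistMult.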

Training OTE

In [10]
# OTE
!python -u train.py --model_name OTE \
                    --data_name OpenBG500 \
                    --data_path /home/aistudio/data/ \
                    --save_path /home/aistudio/result/transe \
                    --batch_size 512 --log_interval 1000 --neg_sample_type 'chunk' --neg_sample_size 256 --max_steps 10000 \
                    --embed_dim 400 --gamma 15.0 -adv -a 0.5 --ote_scale 2 --ote_size 20 --print_on_screen --test --filter_eval \
                    --lr 0.002 --optimizer adam --scheduler_interval 25000 --valid
----------------------------------------
        Device Setting
----------------------------------------
 Entity   embedding place: gpu
 Relation embedding place: gpu
----------------------------------------
----------------------------------------
       Embedding Setting
----------------------------------------
 Entity   embedding dimension: 400
 Relation embedding dimension: 8400
----------------------------------------
2024-12-03 20:17:21,327 INFO     seed                :0
2024-12-03 20:17:21,327 INFO     data_path           :/home/aistudio/data/
2024-12-03 20:17:21,327 INFO     save_path           :/home/aistudio/result/transe/ote_OpenBG500_d_400_g_15.0_e_gpu_r_gpu_l_Logsigmoid_lr_0.002_0.1_KGE
2024-12-03 20:17:21,327 INFO     init_from_ckpt      :None
2024-12-03 20:17:21,328 INFO     data_name           :OpenBG500
2024-12-03 20:17:21,328 INFO     use_dict            :False
2024-12-03 20:17:21,328 INFO     kv_mode             :False
2024-12-03 20:17:21,328 INFO     batch_size          :512
2024-12-03 20:17:21,328 INFO     test_batch_size     :16
2024-12-03 20:17:21,328 INFO     neg_sample_size     :256
2024-12-03 20:17:21,328 INFO     filter_eval         :True
2024-12-03 20:17:21,328 INFO     model_name          :ote
2024-12-03 20:17:21,328 INFO     embed_dim           :400
2024-12-03 20:17:21,328 INFO     reg_coef            :0
2024-12-03 20:17:21,328 INFO     loss_type           :Logsigmoid
2024-12-03 20:17:21,328 INFO     max_steps           :10000
2024-12-03 20:17:21,328 INFO     lr                  :0.002
2024-12-03 20:17:21,328 INFO     optimizer           :adam
2024-12-03 20:17:21,328 INFO     cpu_lr              :0.1
2024-12-03 20:17:21,328 INFO     cpu_optimizer       :adagrad
2024-12-03 20:17:21,328 INFO     mix_cpu_gpu         :False
2024-12-03 20:17:21,329 INFO     async_update        :False
2024-12-03 20:17:21,329 INFO     valid               :True
2024-12-03 20:17:21,329 INFO     test                :True
2024-12-03 20:17:21,329 INFO     task_name           :KGE
2024-12-03 20:17:21,329 INFO     num_workers         :0
2024-12-03 20:17:21,329 INFO     neg_sample_type     :chunk
2024-12-03 20:17:21,329 INFO     neg_deg_sample      :False
2024-12-03 20:17:21,329 INFO     neg_adversarial_sampling:True
2024-12-03 20:17:21,329 INFO     adversarial_temperature:0.5
2024-12-03 20:17:21,329 INFO     filter_sample       :False
2024-12-03 20:17:21,329 INFO     valid_percent       :1.0
2024-12-03 20:17:21,329 INFO     use_feature         :False
2024-12-03 20:17:21,329 INFO     reg_type            :norm_er
2024-12-03 20:17:21,329 INFO     reg_norm            :3
2024-12-03 20:17:21,329 INFO     weighted_loss       :False
2024-12-03 20:17:21,329 INFO     margin              :1.0
2024-12-03 20:17:21,330 INFO     pairwise            :False
2024-12-03 20:17:21,330 INFO     gamma               :15.0
2024-12-03 20:17:21,330 INFO     ote_scale           :2
2024-12-03 20:17:21,330 INFO     ote_size            :20
2024-12-03 20:17:21,330 INFO     quate_lmbda1        :0.0
2024-12-03 20:17:21,330 INFO     quate_lmbda2        :0.0
2024-12-03 20:17:21,330 INFO     num_epoch           :1000000
2024-12-03 20:17:21,330 INFO     scheduler_interval  :25000
2024-12-03 20:17:21,330 INFO     num_process         :1
2024-12-03 20:17:21,330 INFO     print_on_screen     :True
2024-12-03 20:17:21,330 INFO     log_interval        :1000
2024-12-03 20:17:21,330 INFO     save_interval       :-1
2024-12-03 20:17:21,330 INFO     eval_interval       :50000
2024-12-03 20:17:21,330 INFO     ent_emb_on_cpu      :False
2024-12-03 20:17:21,330 INFO     rel_emb_on_cpu      :False
2024-12-03 20:17:21,330 INFO     use_embedding_regularization:False
2024-12-03 20:17:21,331 INFO     ent_dim             :400
2024-12-03 20:17:21,331 INFO     rel_dim             :8400
2024-12-03 20:17:21,331 INFO     num_chunks          :2
W1203 20:17:40.313774 15093 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1203 20:17:40.317301 15093 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
Epoch 0: StepDecay set learning rate to 0.002.
2024-12-03 20:18:30,970 INFO     step: 999, loss: 3.82042, reg: 0.0000e+00, speed: 20.39 steps/s, time: 49.06 s
2024-12-03 20:18:30,970 INFO     sample: 2.537456, forward: 35.948929, backward: 10.029451, update: 0.519803
2024-12-03 20:19:18,248 INFO     step: 1999, loss: 1.52916, reg: 0.0000e+00, speed: 21.15 steps/s, time: 47.28 s
2024-12-03 20:19:18,248 INFO     sample: 1.869959, forward: 34.982310, backward: 9.909149, update: 0.498044
2024-12-03 20:20:06,206 INFO     step: 2999, loss: 0.70538, reg: 0.0000e+00, speed: 20.85 steps/s, time: 47.96 s
2024-12-03 20:20:06,206 INFO     sample: 2.374590, forward: 35.117019, backward: 9.945933, update: 0.500709
2024-12-03 20:20:53,649 INFO     step: 3999, loss: 0.45315, reg: 0.0000e+00, speed: 21.08 steps/s, time: 47.44 s
2024-12-03 20:20:53,650 INFO     sample: 1.874641, forward: 35.024741, backward: 10.005383, update: 0.518727
2024-12-03 20:21:42,188 INFO     step: 4999, loss: 0.38411, reg: 0.0000e+00, speed: 20.60 steps/s, time: 48.54 s
2024-12-03 20:21:42,189 INFO     sample: 2.392149, forward: 35.182606, backward: 10.389102, update: 0.553313
2024-12-03 20:22:30,258 INFO     step: 5999, loss: 0.22264, reg: 0.0000e+00, speed: 20.80 steps/s, time: 48.07 s
2024-12-03 20:22:30,258 INFO     sample: 1.858560, forward: 35.172825, backward: 10.468420, update: 0.547993
2024-12-03 20:23:18,059 INFO     step: 6999, loss: 0.24304, reg: 0.0000e+00, speed: 20.92 steps/s, time: 47.80 s
2024-12-03 20:23:18,060 INFO     sample: 1.789395, forward: 35.161546, backward: 10.283572, update: 0.545424
2024-12-03 20:24:06,737 INFO     step: 7999, loss: 0.19999, reg: 0.0000e+00, speed: 20.54 steps/s, time: 48.68 s
2024-12-03 20:24:06,738 INFO     sample: 2.542719, forward: 35.021526, backward: 10.546183, update: 0.546523
2024-12-03 20:24:55,473 INFO     step: 8999, loss: 0.19842, reg: 0.0000e+00, speed: 20.52 steps/s, time: 48.74 s
2024-12-03 20:24:55,474 INFO     sample: 2.101255, forward: 34.948719, backward: 11.074480, update: 0.587884
2024-12-03 20:25:44,587 INFO     step: 9999, loss: 0.20394, reg: 0.0000e+00, speed: 20.36 steps/s, time: 49.11 s
2024-12-03 20:25:44,588 INFO     sample: 2.529099, forward: 35.144207, backward: 10.839553, update: 0.578573
2024-12-03 20:25:44,588 INFO     [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:06<00:00,  4.74it/s]
2024-12-03 20:26:50,790 INFO     -------------- test result --------------
2024-12-03 20:26:50,791 INFO     t,r->h  |MRR: 6.202812073752284e-05 MR: 102160.9966 HITS@1: 0.0 HITS@3: 0.0 HITS@10: 0.0
2024-12-03 20:26:50,791 INFO     h,r->t  |MRR: 4.467178769118618e-06 MR: 226216.9044 HITS@1: 0.0 HITS@3: 0.0 HITS@10: 0.0
2024-12-03 20:26:50,791 INFO     average |MRR: 3.324764838907868e-05 MR: 164188.9505 HITS@1: 0.0 HITS@3: 0.0 HITS@10: 0.0
2024-12-03 20:26:50,791 INFO     -----------------------------------------
2024-12-03 20:26:50,819 INFO     [evaluation] finished! It takes 66.2304 sec s
2024-12-03 20:26:50,819 INFO     [evaluation] start...
100%|█████████████████████████████████████████| 313/313 [01:05<00:00,  4.77it/s]
2024-12-03 20:27:56,598 INFO     -------------- valid result --------------
2024-12-03 20:27:56,598 INFO     t,r->h  |MRR: 0.014652694575488567 MR: 14285.9508 HITS@1: 0.0046 HITS@3: 0.0124 HITS@10: 0.0278
2024-12-03 20:27:56,598 INFO     h,r->t  |MRR: 0.5156063437461853 MR: 2179.2114 HITS@1: 0.396 HITS@3: 0.596 HITS@10: 0.7332
2024-12-03 20:27:56,598 INFO     average |MRR: 0.2651295065879822 MR: 8232.5811 HITS@1: 0.2003 HITS@3: 0.30419999999999997 HITS@10: 0.3805
2024-12-03 20:27:56,598 INFO     -----------------------------------------
2024-12-03 20:27:56,624 INFO     [evaluation] finished! It takes 65.8045 sec s
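The log shows a relation dimension of 8400 for `--embed_dim 400 --ote_size 20 --ote_scale 2`. One reading that reproduces this number is: the 400 entity dimensions are split into 20 groups of 20, each relation holds one 20 × 20 matrix per group (400 × 20 = 8000 values), plus a scale vector of 400 values. The sketch below illustrates that arithmetic and the core idea of OTE, applying an orthogonalized block to each group of the head embedding. It is only an illustration under these assumptions: the function names are hypothetical, scaling is left out of the transform, and PGL's actual implementation may differ.

```python
import math


def ote_relation_dim(embed_dim, ote_size, use_scale):
    """Parameters per relation under the reading described above:
    ote_size values per entity dimension for the blocks, plus one extra
    value per entity dimension when a scale vector is used."""
    return embed_dim * (ote_size + int(use_scale))


def gram_schmidt(rows):
    """Orthonormalize the rows of a square matrix given as a list of lists."""
    basis = []
    for row in rows:
        v = list(row)
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b))
            v = [x - proj * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return basis


def ote_transform(h, blocks):
    """Apply one orthogonalized k x k block to each k-sized slice of h."""
    k = len(blocks[0])
    out = []
    for b, block in enumerate(blocks):
        q = gram_schmidt(block)
        h_b = h[b * k:(b + 1) * k]
        out.extend(sum(x * y for x, y in zip(h_b, q_row)) for q_row in q)
    return out


print(ote_relation_dim(400, 20, use_scale=True))

# Tiny example: 4-dimensional entity, two 2 x 2 relation blocks.
h = [3.0, 4.0, 1.0, 2.0]
blocks = [[[1.0, 1.0], [0.0, 1.0]], [[2.0, 0.0], [1.0, 3.0]]]
h_rot = ote_transform(h, blocks)
print(math.sqrt(sum(x * x for x in h)), math.sqrt(sum(x * x for x in h_rot)))
```

Since every block is orthogonal after Gram-Schmidt, the transform preserves the norm of the head embedding, so the two norms printed at the end agree up to floating-point error.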

Prediction

In [ ]
%cd /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph2KG/
# The test set has no labels, so we only save the predictions (top 10 per query)
# Remember to set embed_dim: the model trained above uses dimension 400, so it must match here (default is 200)
# Load the trained model via init_from_ckpt
# Switch the mode to test
!python predict.py --seed 1107 --data_name OpenBG500 \
                   --data_path /home/aistudio/data/ \
                   --model_name TransE \
                   --save_path /home/aistudio/result/pred \
                   --embed_dim 400 \
                   --init_from_ckpt /home/aistudio/result/transe_fb_sgpu/transe_OpenBG500_d_400_g_19.9_e_gpu_r_gpu_l_Logsigmoid_lr_0.25_0.1_KGE \
                   --test
In [1]
# Given a head entity and a relation, predict the tail entity
%cd /home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph2KG/
/home/aistudio/PGL/PGL-0121d96a5ffb385024f8ba13285da5880dd2753c/apps/Graph2KG
In [2]
import os

import paddle
import paddle.distributed as dist
import pgl

from dataset.reader import read_trigraph
from models.ke_model import KGEModel


class Args:
    """Stand-in for the command-line arguments used when training TransE."""

    def __init__(self):
        self.seed = 0
        self.data_path = '/home/aistudio/data/'
        self.save_path = '/home/aistudio/result/transe/transe_OpenBG500_d_400_g_19.9_e_gpu_r_gpu_l_Logsigmoid_lr_0.25_0.1_KGE'
        self.init_from_ckpt = '/home/aistudio/result/transe/transe_OpenBG500_d_400_g_19.9_e_gpu_r_gpu_l_Logsigmoid_lr_0.25_0.1_KGE'
        self.data_name = 'OpenBG500'
        self.use_dict = False
        self.kv_mode = 'kv'
        self.valid_percent = 1.
        self.filter_sample = False
        self.filter_eval = True
        self.weighted_loss = False
        self.model_name = 'transe'
        # Embedding sizes must match the trained checkpoint (embed_dim 400).
        self.ent_dim = 400
        self.rel_dim = 400
        self.ent_emb_on_cpu = False
        self.rel_emb_on_cpu = False
        self.num_chunks = 5
        self.cpu_lr = 0.1
        self.mix_cpu_gpu = False
        self.gamma = 19.9
        self.ote_size = 0
        self.ote_scale = 1
        self.use_feature = False


def build_model(args):
    """Read the graph, build the KGE model and load the trained weights."""
    trigraph = read_trigraph(args.data_path, args.data_name, args.use_dict,
                             args.kv_mode)
    if args.valid_percent < 1:
        trigraph.sampled_subgraph(args.valid_percent, dataset='valid')

    # Known true heads/tails, as built in training; not used further in this cell.
    use_filter_set = args.filter_sample or args.filter_eval or args.weighted_loss
    if use_filter_set:
        filter_dict = {
            'head': trigraph.true_heads_for_tail_rel,
            'tail': trigraph.true_tails_for_head_rel
        }
    else:
        filter_dict = None

    if dist.get_world_size() > 1:
        dist.init_parallel_env()

    model = KGEModel(args.model_name, trigraph, args)
    if args.init_from_ckpt:
        state_dict = paddle.load(
            os.path.join(args.init_from_ckpt, 'params.pdparams'))
        # Copy the trained weights into the model; without this step the model
        # keeps its random initialization (assumes KGEModel is a paddle.nn.Layer).
        model.set_state_dict(state_dict)
    return model
In [6]
def get_dict():
    """Read the relation and entity dictionaries (name -> integer id)."""
    rel_dict = dict()
    ent_dict = dict()
    with open('/home/aistudio/data/OpenBG500/relations.dict', 'r') as f:
        for line in f:
            k, v = line.strip().split('\t')
            rel_dict[k] = int(v)
    with open('/home/aistudio/data/OpenBG500/entities.dict', 'r') as f:
        for line in f:
            k, v = line.strip().split('\t')
            ent_dict[k] = int(v)
    return rel_dict, ent_dict


def do_predict(model, head, relation):
    """Return the ids of the 10 highest-scoring tail entities for (head, relation)."""
    model.eval()
    rel_dict, ent_dict = get_dict()
    h = paddle.to_tensor([ent_dict[head]], 'int64')
    r = paddle.to_tensor([rel_dict[relation]], 'int64')
    with paddle.no_grad():
        t_score = model.predict(h, r, mode='tail')
    # Sort entity ids by score, best first, and keep the top 10.
    t_score = t_score.argsort(descending=True)
    return t_score[:, :10][0]


def get_key(id_):
    """Map an integer entity id back to its entity name."""
    _, ent_dict = get_dict()
    for name, idx in ent_dict.items():
        if idx == int(id_):
            return name


# =========== Example ==============
args = Args()
model = build_model(args)
head = 'ent_238303'
relation = 'rel_0320'
tails = do_predict(model, head, relation)
print(f"Top-10 entity ids by score: {tails.numpy()}")
print(f"{head} - {relation} --> {get_key(int(tails[0]))}")

Top-10 entity ids by score: [   138 146677 221006 180932  27258 187664 145176 190357  94183  41276]
ent_238303 - rel_0320 --> ent_122958
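To make the prediction step more concrete, here is a small stand-alone sketch of what tail prediction with TransE amounts to: score every candidate tail by how close it lies to `h + r`, then keep the best few. This is not the PGL implementation. The function names are hypothetical, the L2 distance is an assumption (TransE is also commonly used with L1), and `gamma=19.9` is simply the margin that appears in the checkpoint name above.

```python
import math


def transe_tail_scores(h, r, entity_emb, gamma=19.9):
    """Score each candidate tail t as gamma - ||h + r - t||_2 (higher is better)."""
    target = [hi + ri for hi, ri in zip(h, r)]
    scores = []
    for t in entity_emb:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(target, t)))
        scores.append(gamma - dist)
    return scores


def top_k(scores, k=10):
    """Indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy embedding table with four entities in two dimensions.
entity_emb = [[0.0, 0.0], [2.0, 1.0], [2.0, 3.0], [5.0, 1.0]]
h = [1.0, 0.0]
r = [1.0, 1.0]

scores = transe_tail_scores(h, r, entity_emb)
print(top_k(scores, k=2))
```

In the toy example, entity 1 sits exactly at `h + r`, so it gets the maximum score `gamma` and comes first.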

References

[1] Qu, Yincen, et al. "Commonsense Knowledge Salience Evaluation with a Benchmark Dataset in E-commerce." Findings of EMNLP 2022.

[2] Xie, Xin, et al. "From Discrimination to Generation: Knowledge Graph Completion with Generative Transformer." WWW 2022 (Poster).

[3] Deng, Shumin, et al. "Construction and Applications of Billion-Scale Multimodal Pre-trained Business Knowledge Graph." arXiv preprint arXiv:2209.15214 (2022).

[4] Kadlec, Rudolf, Ondrej Bajgar, and Jan Kleindienst. "Knowledge base completion: Baselines strike back." arXiv preprint arXiv:1705.10744 (2017).

[5] Trouillon, Théo, et al. "Complex embeddings for simple link prediction." International conference on machine learning. PMLR, 2016.

[6] Sun, Zhiqing, et al. "Rotate: Knowledge graph embedding by relational rotation in complex space." arXiv preprint arXiv:1902.10197 (2019).

[7] Tang, Yun, et al. "Orthogonal relation transforms with graph context modeling for knowledge graph embedding." arXiv preprint arXiv:1911.04910 (2019).

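Reference blogs

[1] TransE (Translating Embedding) explained, with a simple Python implementation https://blog.csdn.net/shunaoxi2313/article/details/89766467

[2] A large-scale open digital-commerce knowledge graph benchmark arrives: OpenBG launches on Tianchi https://m.thepaper.cn/baijiahao_20744274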
