Loading pre-trained spanBERT from ./pretrained_spanbert
____
Parameters:
Client key  = XXXXXX
Engine key  = XXXXXX
Gemini key  = XXXXXX
Method      = spanbert
Relation    = Schools_Attended
Threshold   = 0.7
Query       = sergey brin stanford
# of Tuples = 10
Loading necessary libraries; This should take a minute or so ...)
=========== Iteration: 0 - Query: sergey brin stanford ===========

URL ( 1 / 10): https://en.wikipedia.org/wiki/Sergey_Brin
    Fetching text from url ...
    Trimming webpage content from 41927 to 10000 characters
    Webpage length (num characters): 10000
    Annotating the webpage using spacy...
    Extracted 69 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
    Processed 5 / 69 sentences
    Processed 10 / 69 sentences
    Processed 15 / 69 sentences
    Processed 20 / 69 sentences
    Processed 25 / 69 sentences
        === Extracted Relation ===
        Input tokens: ['Mikhail', 'and', 'Eugenia', 'Brin', '(', '1948–2024', ')', ',', 'both', 'graduates', 'of', 'Moscow', 'State', 'University', '(']
        Output Confidence: 0.9907721 ; Subject: Eugenia Brin ; Object: Moscow State University ;
        Adding to set of extracted relations
        ==========
    Processed 30 / 69 sentences
    Processed 35 / 69 sentences
        === Extracted Relation ===
        Input tokens: ['Brin', 'attended', 'elementary', 'school', 'at', 'Paint', 'Branch', 'Montessori', 'School', 'in', 'Adelphi', ',']
        Output Confidence: 0.93284553 ; Subject: Brin ; Object: Paint Branch Montessori School ;
        Adding to set of extracted relations
        ==========
        === Extracted Relation ===
        Input tokens: ['Brin', 'attended', 'elementary', 'school', 'at', 'Paint', 'Branch', 'Montessori', 'School', 'in', 'Adelphi', ',', 'Maryland', ',', 'but', 'he', 'received', 'further', 'education', 'at', 'home', ';', 'his', 'father', ',', 'a', 'professor', 'in', 'the', 'department', 'of', 'mathematics', 'at', 'the', 'University', 'of', 'Maryland', ',']
        Output Confidence: 0.95533353 ; Subject: Brin ; Object: the University of Maryland ;
        Adding to set of extracted relations
        ==========
        === Extracted Relation ===
        Input tokens: ['Brin', 'enrolled', 'in', 'the', 'University', 'of', 'Maryland', ',', 'where', 'he', 'received', 'his', 'Bachelor', 'of', 'Science', 'from', 'the', 'Department', 'of', 'Computer', 'Science', 'in', '1993', 'with', 'honors', 'in', 'computer', 'science', 'and', 'mathematics', 'at', 'the', 'age', 'of', '19.[14', ']']
        Output Confidence: 0.8560143 ; Subject: Brin ; Object: Bachelor of Science ;
        Adding to set of extracted relations
        ==========
        === Extracted Relation ===
        Input tokens: ['Brin', 'enrolled', 'in', 'the', 'University', 'of', 'Maryland', ',', 'where', 'he', 'received', 'his', 'Bachelor', 'of', 'Science', 'from', 'the', 'Department', 'of', 'Computer', 'Science', 'in', '1993', 'with', 'honors', 'in', 'computer', 'science', 'and', 'mathematics', 'at', 'the', 'age', 'of', '19.[14', ']']
        Output Confidence: 0.9894984 ; Subject: Brin ; Object: the Department of Computer Science ;
        Adding to set of extracted relations
        ==========
        === Extracted Relation ===
        Input tokens: ['Brin', 'began', 'his', 'graduate', 'study', 'in', 'computer', 'science', 'at', 'Stanford', 'University', 'on', 'a', 'graduate', 'fellowship', 'from', 'the', 'National', 'Science', 'Foundation', ',']
        Output Confidence: 0.9527388 ; Subject: Brin ; Object: Stanford University ;
        Adding to set of extracted relations
        ==========
        === Extracted Relation ===
        Input tokens: ['Brin', 'began', 'his', 'graduate', 'study', 'in', 'computer', 'science', 'at', 'Stanford', 'University', 'on', 'a', 'graduate', 'fellowship', 'from', 'the', 'National', 'Science', 'Foundation', ',']
        Output Confidence: 0.6411243 ; Subject: Brin ; Object: the National Science Foundation ;
        Confidence is lower than threshold confidence. Ignoring this.
        ==========
    Processed 40 / 69 sentences
    Processed 45 / 69 sentences
    Processed 50 / 69 sentences
    Processed 55 / 69 sentences
    Processed 60 / 69 sentences
    Processed 65 / 69 sentences
    Extracted annotations for 4 out of total 69 sentences
    Relations extracted from this website: 6 (Overall: 7)

URL ( 2 / 10): http://infolab.stanford.edu/~sergey/
    Fetching text from url ...
    Webpage length (num characters): 4579
    Annotating the webpage using spacy...
    Extracted 58 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
        === Extracted Relation ===
        Input tokens: [' ', 'Sergey', 'Brin', 'Sergey', 'Brin', "'s", 'Home', 'Page', 'Ph.D.', 'student', 'in', 'Computer', 'Science', 'at', 'Stanford', '-', 'sergey@cs.stanford.edu', 'Research', 'Currently', 'I', 'am', 'at', 'Google', '.']
        Output Confidence: 0.58440864 ; Subject: Sergey Brin Sergey Brin's ; Object: Stanford - sergey@cs.stanford.edu Research ;
        Confidence is lower than threshold confidence. Ignoring this.
        ==========
        === Extracted Relation ===
        Input tokens: [' ', 'Sergey', 'Brin', 'Sergey', 'Brin', "'s", 'Home', 'Page', 'Ph.D.', 'student', 'in', 'Computer', 'Science', 'at', 'Stanford', '-', 'sergey@cs.stanford.edu', 'Research', 'Currently', 'I', 'am', 'at', 'Google', '.']
        Output Confidence: 0.4831584 ; Subject: Sergey Brin Sergey Brin's ; Object: Google ;
        Confidence is lower than threshold confidence. Ignoring this.
        ==========
    Processed 5 / 58 sentences
    Processed 10 / 58 sentences
    Processed 15 / 58 sentences
    Processed 20 / 58 sentences
    Processed 25 / 58 sentences
    Processed 30 / 58 sentences
    Processed 35 / 58 sentences
    Processed 40 / 58 sentences
        === Extracted Relation ===
        Input tokens: ['Together', 'with', 'James', 'Davis', '(', 'another', 'Ph.D.', 'student', 'here', ')', ',', 'we', 'developed', 'COPS', ',', 'the', 'COpyright', 'Protection', 'System', '.']
        Output Confidence: 0.534651 ; Subject: James Davis ; Object: the COpyright Protection System ;
        Confidence is lower than threshold confidence. Ignoring this.
        ==========
    Processed 45 / 58 sentences
    Processed 50 / 58 sentences
    Processed 55 / 58 sentences
    Extracted annotations for 2 out of total 58 sentences
    Relations extracted from this website: 0 (Overall: 3)

URL ( 3 / 10): https://engineering.stanford.edu/about/history/heroes/2014-heroes/sergey-brin
    Fetching text from url ...
    Webpage length (num characters): 6534
    Annotating the webpage using spacy...
    Extracted 30 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
    Processed 5 / 30 sentences
    Processed 10 / 30 sentences
    Processed 15 / 30 sentences
        === Extracted Relation ===
        Input tokens: ['Future', 'Main', 'content', 'start', 'Sergey', 'Brin', '—', 'Google', 'co', '-', 'founder', 'Sergey', 'Brin', 'co', '-', 'founded', 'web', '-', 'search', 'giant', 'Google', 'Inc.', 'in', '1998', 'with', 'fellow', 'Stanford', 'student', 'Larry', 'Page', '.']
        Output Confidence: 0.6876008 ; Subject: Sergey Brin ; Object: Stanford ;
        Confidence is lower than threshold confidence. Ignoring this.
        ==========
        === Extracted Relation ===
        Input tokens: ['founder', 'Sergey', 'Brin', 'co', '-', 'founded', 'web', '-', 'search', 'giant', 'Google', 'Inc.', 'in', '1998', 'with', 'fellow', 'Stanford', 'student', 'Larry', 'Page', '.']
        Output Confidence: 0.81827945 ; Subject: Sergey Brin ; Object: Stanford ;
        Adding to set of extracted relations
        ==========
        === Extracted Relation ===
        Input tokens: ['search', 'giant', 'Google', 'Inc.', 'in', '1998', 'with', 'fellow', 'Stanford', 'student', 'Larry', 'Page', '.']
        Output Confidence: 0.9513414 ; Subject: Larry Page ; Object: Stanford ;
        Adding to set of extracted relations
        ==========
        === Extracted Relation ===
        Input tokens: ['Brin', 'earned', 'his', 'master', '’s', 'degree', 'in', 'computer', 'science', 'at', 'Stanford', ',']
        Output Confidence: 0.7757271 ; Subject: Brin ; Object: Stanford ;
        Adding to set of extracted relations
        ==========
        === Extracted Relation ===
        Input tokens: ['Brin', 'earned', 'his', 'master', '’s', 'degree', 'in', 'computer', 'science', 'at', 'Stanford', ',', 'where', 'he', 'and', 'Page', 'developed', 'the', '“']
        Output Confidence: 0.9301687 ; Subject: Page ; Object: Stanford ;
        Adding to set of extracted relations
        ==========
    Processed 20 / 30 sentences
    Processed 25 / 30 sentences
    Processed 30 / 30 sentences
    Extracted annotations for 2 out of total 30 sentences
    Relations extracted from this website: 4 (Overall: 5)

URL ( 4 / 10): http://infolab.stanford.edu/pub/papers/google.pdf
    Fetching text from url ...
    Trimming webpage content from 112597 to 10000 characters
    Webpage length (num characters): 10000
    Annotating the webpage using spacy...
    Extracted 60 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
    Processed 5 / 60 sentences
    Processed 10 / 60 sentences
    Processed 15 / 60 sentences
    Processed 20 / 60 sentences
    Processed 25 / 60 sentences
    Processed 30 / 60 sentences
    Processed 35 / 60 sentences
    Processed 40 / 60 sentences
    Processed 45 / 60 sentences
    Processed 50 / 60 sentences
    Processed 55 / 60 sentences
    Processed 60 / 60 sentences
    Extracted annotations for 0 out of total 60 sentences
    Relations extracted from this website: 0 (Overall: 0)

URL ( 5 / 10): https://snap.stanford.edu/class/cs224w-readings/Brin98Anatomy.pdf
    Fetching text from url ...
    Unable to fetch URL. Continuing.

URL ( 6 / 10): https://www.quora.com/What-was-it-like-to-be-at-Stanford-with-Sergey-Brin-and-Larry-Page
    Fetching text from url ...
    Webpage length (num characters): 194
    Annotating the webpage using spacy...
    Extracted 4 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
    Extracted annotations for 0 out of total 4 sentences
    Relations extracted from this website: 0 (Overall: 0)

URL ( 7 / 10): https://about.google/intl/ALL_us/our-story/
    Fetching text from url ...
    Webpage length (num characters): 3428
    Annotating the webpage using spacy...
    Extracted 24 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
        === Extracted Relation ===
        Input tokens: ['Larry', 'Page', 'was', 'considering', 'Stanford', 'for', 'grad', 'school', 'and', 'Sergey', 'Brin', ',']
        Output Confidence: 0.5366027 ; Subject: Larry Page ; Object: Stanford ;
        Confidence is lower than threshold confidence. Ignoring this.
        ==========
    Processed 5 / 24 sentences
    Processed 10 / 24 sentences
    Processed 15 / 24 sentences
    Processed 20 / 24 sentences
    Extracted annotations for 1 out of total 24 sentences
    Relations extracted from this website: 0 (Overall: 1)

URL ( 8 / 10): https://facts.stanford.edu/alumni/
    Fetching text from url ...
    Webpage length (num characters): 4834
    Annotating the webpage using spacy...
    Extracted 17 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
    Processed 5 / 17 sentences
    Processed 10 / 17 sentences
    Processed 15 / 17 sentences
    Extracted annotations for 0 out of total 17 sentences
    Relations extracted from this website: 0 (Overall: 0)

URL ( 9 / 10): http://ilpubs.stanford.edu/422/1/1999-66.pdf
    Fetching text from url ...
    Trimming webpage content from 273295 to 10000 characters
    Webpage length (num characters): 10000
    Annotating the webpage using spacy...
    Extracted 38 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
    Processed 5 / 38 sentences
    Processed 10 / 38 sentences
    Processed 15 / 38 sentences
    Processed 20 / 38 sentences
    Processed 25 / 38 sentences
    Processed 30 / 38 sentences
    Processed 35 / 38 sentences
    Extracted annotations for 0 out of total 38 sentences
    Relations extracted from this website: 0 (Overall: 0)

URL ( 10 / 10): https://graphics.stanford.edu/~dk/google_name_origin.html
    Fetching text from url ...
    Webpage length (num characters): 1815
    Annotating the webpage using spacy...
    Extracted 12 sentences. Processing each sentence one by one to check for presence of right pair of named entity types; if so, will run the second pipeline ...
    Processed 5 / 12 sentences
    Processed 10 / 12 sentences
    Extracted annotations for 0 out of total 12 sentences
    Relations extracted from this website: 0 (Overall: 0)

================== ALL RELATIONS for per:schools_attended ( 10 ) =================
Confidence: 0.9907721  | Subject: Eugenia Brin | Object: Moscow State University
Confidence: 0.9894984  | Subject: Brin         | Object: the Department of Computer Science
Confidence: 0.95533353 | Subject: Brin         | Object: the University of Maryland
Confidence: 0.9527388  | Subject: Brin         | Object: Stanford University
Confidence: 0.9513414  | Subject: Larry Page   | Object: Stanford
Confidence: 0.93284553 | Subject: Brin         | Object: Paint Branch Montessori School
Confidence: 0.9301687  | Subject: Page         | Object: Stanford
Confidence: 0.8560143  | Subject: Brin         | Object: Bachelor of Science
Confidence: 0.81827945 | Subject: Sergey Brin  | Object: Stanford
Confidence: 0.7757271  | Subject: Brin         | Object: Stanford
Total # of iterations = 1
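
Note on the transcript above: candidates scoring below the 0.7 threshold are ignored, and the final list is printed in descending order of confidence. The Python sketch below illustrates that selection step under stated assumptions; it is not the tool's actual code. The names `Relation` and `select_relations` are hypothetical, and keeping only the highest score per (subject, object) pair is an assumption suggested by the phrase "set of extracted relations", not something the log confirms.

```python
from typing import Iterable, List, NamedTuple


class Relation(NamedTuple):
    """Hypothetical container for one extracted candidate."""
    confidence: float
    subject: str
    obj: str


def select_relations(candidates: Iterable[Relation], threshold: float) -> List[Relation]:
    """Drop candidates below the threshold, keep the best score per
    (subject, object) pair (an assumption), and sort by confidence, highest first."""
    best = {}
    for rel in candidates:
        if rel.confidence < threshold:
            continue  # "Confidence is lower than threshold confidence. Ignoring this."
        key = (rel.subject, rel.obj)
        if key not in best or rel.confidence > best[key].confidence:
            best[key] = rel
    return sorted(best.values(), key=lambda r: r.confidence, reverse=True)


# Four candidates copied from the URL 3 portion of the transcript.
sample = [
    Relation(0.6876008, "Sergey Brin", "Stanford"),
    Relation(0.81827945, "Sergey Brin", "Stanford"),
    Relation(0.9513414, "Larry Page", "Stanford"),
    Relation(0.7757271, "Brin", "Stanford"),
]

for rel in select_relations(sample, threshold=0.7):
    print(f"Confidence: {rel.confidence} | Subject: {rel.subject} | Object: {rel.obj}")
```

With these four candidates, the 0.6876008 entry is dropped and the remaining three are printed in the same relative order as in the final list of the transcript.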