Class: RDF::Turtle::Writer

Inherits:
Writer
  • Object
Includes:
StreamingWriter, Util::Logger
Defined in:
vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb

Overview

A Turtle serialiser

Note that the natural interface is to write a whole graph at a time. Writing statements or Triples will create a graph to add them to and then serialize the graph.

The writer collects prefix definitions, emits them as @prefix declarations, and uses them to mint QNames (prefixed names).

Examples:

Obtaining a Turtle writer class

RDF::Writer.for(:ttl)         #=> RDF::Turtle::Writer
RDF::Writer.for("etc/test.ttl")
RDF::Writer.for(file_name:       "etc/test.ttl")
RDF::Writer.for(file_extension:  "ttl")
RDF::Writer.for(content_type:    "text/turtle")

Serializing an RDF graph into a Turtle file

RDF::Turtle::Writer.open("etc/test.ttl") do |writer|
  writer << graph
end

Serializing RDF statements into a Turtle file

RDF::Turtle::Writer.open("etc/test.ttl") do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Serializing RDF statements into a Turtle string

RDF::Turtle::Writer.buffer do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Serializing RDF statements to a string in streaming mode

RDF::Turtle::Writer.buffer(stream:  true) do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Creating @base and @prefix definitions in output

RDF::Turtle::Writer.buffer(base_uri:  "http://example.com/", prefixes:  {
    nil => "http://example.com/ns#",
    foaf:  "http://xmlns.com/foaf/0.1/"}
) do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Author:

Direct Known Subclasses

RDF::TriG::Writer

Instance Attribute Summary

Class Method Summary

Instance Method Summary

Methods included from StreamingWriter

#stream_epilogue, #stream_prologue, #stream_statement

Constructor Details

#initialize(output = $stdout, options = {}) {|writer| ... } ⇒ Writer

Initializes the Turtle writer instance.

Parameters:

  • output (IO, File) (defaults to: $stdout)

    the output stream

  • options (Hash{Symbol => Object}) (defaults to: {})

    any additional options

Options Hash (options):

  • :encoding (Encoding) — default: Encoding::UTF_8

    the encoding to use on the output stream

  • :canonicalize (Boolean) — default: false

    whether to canonicalize literals when serializing

  • :prefixes (Hash) — default: Hash.new

    the prefix mappings to use (not supported by all writers)

  • :base_uri (#to_s) — default: nil

    the base URI to use when constructing relative URIs

  • :max_depth (Integer) — default: 3

    Maximum depth for recursively defining resources, defaults to 3

  • :standard_prefixes (Boolean) — default: false

    Add standard prefixes to @prefixes, if necessary.

  • :stream (Boolean) — default: false

    Do not attempt to optimize graph presentation, suitable for streaming large graphs.

  • :default_namespace (String) — default: nil

    URI to use as default namespace, same as prefixes[nil]

  • :unique_bnodes (Boolean) — default: false

    Use unique node identifiers. By default, the writer uses the identifier with which the node was originally initialized (if any).

  • :literal_shorthand (Boolean) — default: true

    Attempt to use Literal shorthands for numbers and boolean values

Yields:

  • (writer)

    self

Yield Parameters:

  • writer (RDF::Writer)

Yield Returns:

  • (void)


# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 126

def initialize(output = $stdout, options = {}, &block)
  @graph = RDF::Graph.new
  @uri_to_pname = {}
  @uri_to_prefix = {}
  options = {literal_shorthand: true}.merge(options)
  super do
    reset
    if block_given?
      case block.arity
        when 0 then instance_eval(&block)
        else block.call(self)
      end
    end
  end
end
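
The constructor dispatches on the arity of the block it receives: a block with no parameters is evaluated with the writer as `self`, and any other block receives the writer as an argument. The following is a minimal standalone sketch of that dispatch. `SketchWriter` is a hypothetical stand-in class, not part of rdf-turtle.

```ruby
# Hypothetical stand-in class used only to illustrate block-arity dispatch.
class SketchWriter
  attr_reader :lines

  def initialize(&block)
    @lines = []
    return unless block_given?

    if block.arity.zero?
      # Zero-arity block: run it with the writer as self
      instance_eval(&block)
    else
      # Otherwise hand the writer to the block as an argument
      block.call(self)
    end
  end

  def <<(line)
    @lines << line
    self
  end
end

explicit = SketchWriter.new { |writer| writer << "a" }
implicit = SketchWriter.new { self << "b" }
```

Both forms end up appending to the same kind of object; the zero-arity form simply saves naming the writer inside the block.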

Instance Attribute Details

#graph ⇒ Graph

Returns Graph of statements serialized

Returns:

  • (Graph)

    Graph of statements serialized



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 64

def graph
  @graph
end

Class Method Details

.options ⇒ Object

Writer options



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 69

def self.options
  super + [
    RDF::CLI::Option.new(
      symbol: :max_depth,
      datatype: Integer,
      on: ["--max-depth DEPTH"],
      description: "Maximum depth for recursively defining resources, defaults to 3.") {true},
    RDF::CLI::Option.new(
      symbol: :stream,
      datatype: TrueClass,
      on: ["--stream"],
      description: "Do not attempt to optimize graph presentation, suitable for streaming large graphs.") {true},
    RDF::CLI::Option.new(
      symbol: :default_namespace,
      datatype: RDF::URI,
      on: ["--default-namespace URI", :REQUIRED],
      description: "URI to use as default namespace, same as prefixes.") {|arg| RDF::URI(arg)},
    RDF::CLI::Option.new(
      symbol: :literal_shorthand,
      datatype: FalseClass,
      on: ["--no-literal-shorthand"],
      description: "Do not attempt to use Literal shorthands for numbers and boolean values.") {false},
  ]
end

Instance Method Details

#blankNodePropertyList?(resource, position) ⇒ Boolean (protected)

Can subject be represented as a blankNodePropertyList?

Returns:

  • (Boolean)


# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 458

def blankNodePropertyList?(resource, position)
  resource.node? &&
    !is_valid_list?(resource) &&
    (!is_done?(resource) || position == :subject) &&
    ref_count(resource) == (position == :object ? 1 : 0)
end

#bump_reference(resource) ⇒ Integer (protected)

Increase the reference count of this resource

Parameters:

  • resource (RDF::Resource)

Returns:

  • (Integer)

    resulting reference count



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 482

def bump_reference(resource)
  @references[resource] = ref_count(resource) + 1
end

#format_literal(literal, options = {}) ⇒ String

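Returns the Turtle representation of a literal, using shorthand forms for valid boolean, integer, decimal, and double values when :literal_shorthand is enabled.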

Parameters:

Returns:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 275

def format_literal(literal, options = {})
  case literal
  when RDF::Literal
    case @options[:literal_shorthand] && literal.valid? ? literal.datatype : false
    when RDF::XSD.boolean, RDF::XSD.integer, RDF::XSD.decimal
      literal.canonicalize.to_s
    when RDF::XSD.double
      literal.canonicalize.to_s.sub('E', 'e')  # Favor lower case exponent
    else
      text = quoted(literal.value)
      text << "@#{literal.language}" if literal.has_language?
      text << "^^#{format_uri(literal.datatype)}" if literal.has_datatype?
      text
    end
  else
    quoted(literal.to_s)
  end
end
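
The shorthand decision can be illustrated without the library. The sketch below is a simplification under stated assumptions: it takes an already canonical lexical form as a plain string, skips validity checking and string escaping, and always writes the datatype as a full IRI. `LiteralSketch` and its method are hypothetical names, not part of rdf-turtle.

```ruby
# Hypothetical, simplified model of the literal shorthand decision.
module LiteralSketch
  XSD = "http://www.w3.org/2001/XMLSchema#"
  SHORTHAND = %w[boolean integer decimal double].map { |t| XSD + t }.freeze

  def self.format_literal(lexical, datatype: nil, language: nil, literal_shorthand: true)
    if literal_shorthand && SHORTHAND.include?(datatype)
      # Doubles favor a lower-case exponent; other shorthand types are bare
      return datatype.end_with?("double") ? lexical.sub("E", "e") : lexical
    end

    text = %("#{lexical}")
    if language
      text + "@#{language}"
    elsif datatype
      text + "^^<#{datatype}>"
    else
      text
    end
  end
end
```

With shorthand disabled, the same integer is written in its long quoted form with an explicit datatype.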

#format_node(node, options = {}) ⇒ String

Returns the Turtle representation of a blank node.

Parameters:

Returns:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 312

def format_node(node, options = {})
  options[:unique_bnodes] ? node.to_unique_base : node.to_base
end

#format_uri(uri, options = {}) ⇒ String

Returns the Turtle representation of a URI reference.

Parameters:

Returns:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 300

def format_uri(uri, options = {})
  md = uri.relativize(base_uri)
  log_debug("relativize") {"#{uri.to_ntriples} => #{md.inspect}"} if md != uri.to_s
  md != uri.to_s ? "<#{md}>" : (get_pname(uri) || "<#{uri}>")
end

#get_pname(resource) ⇒ String?

Return a QName for the URI, or nil. Adds namespace of QName to defined prefixes

Parameters:

  • resource (RDF::Resource)

Returns:

  • (String, nil)

    value to use to identify URI



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 207

def get_pname(resource)
  case resource
  when RDF::Node
    return options[:unique_bnodes] ? resource.to_unique_base : resource.to_base
  when RDF::URI
    uri = resource.to_s
  else
    return nil
  end

  pname = case
  when @uri_to_pname.has_key?(uri)
    return @uri_to_pname[uri]
  when u = @uri_to_prefix.keys.sort_by {|uu| uu.length}.reverse.detect {|uu| uri.index(uu.to_s) == 0}
    # Use a defined prefix
    prefix = @uri_to_prefix[u]
    unless u.to_s.empty?
      prefix(prefix, u)
      log_debug("get_pname") {"add prefix #{prefix.inspect} => #{u}"}
      uri.sub(u.to_s, "#{prefix}:")
    end
  when @options[:standard_prefixes] && vocab = RDF::Vocabulary.each.to_a.detect {|v| uri.index(v.to_uri.to_s) == 0}
    prefix = vocab.__name__.to_s.split('::').last.downcase
    @uri_to_prefix[vocab.to_uri.to_s] = prefix
    prefix(prefix, vocab.to_uri) # Define for output
    log_debug("get_pname") {"add standard prefix #{prefix.inspect} => #{vocab.to_uri}"}
    uri.sub(vocab.to_uri.to_s, "#{prefix}:")
  else
    nil
  end

  # Make sure pname is a valid pname
  if pname
    md = Terminals::PNAME_LN.match(pname) || Terminals::PNAME_NS.match(pname)
    pname = nil unless md.to_s.length == pname.length
  end

  @uri_to_pname[uri] = pname
end
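
The core of the lookup is a longest-match search over the known namespaces. A minimal standalone sketch follows, assuming a plain hash from namespace IRI to prefix label (similar in spirit to `@uri_to_prefix`). `sketch_pname` is a hypothetical helper, and its validity check is a deliberately simplified regular expression rather than the Turtle PNAME_LN and PNAME_NS terminals the writer uses.

```ruby
# Hypothetical helper: map an IRI string to a prefixed name, or nil.
def sketch_pname(uri, uri_to_prefix)
  # Prefer the longest namespace that the IRI starts with
  namespace = uri_to_prefix.keys
                           .sort_by(&:length)
                           .reverse
                           .find { |ns| !ns.empty? && uri.start_with?(ns) }
  return nil unless namespace

  local = uri[namespace.length..-1]
  # Simplified local-name check (assumption): ASCII word characters,
  # with dots and hyphens allowed only in the interior
  valid = local.empty? || local.match?(/\A[A-Za-z0-9_]([A-Za-z0-9_.-]*[A-Za-z0-9_-])?\z/)
  valid ? "#{uri_to_prefix[namespace]}:#{local}" : nil
end

prefixes = {
  "http://example.com/"    => "ex",
  "http://example.com/ns#" => "ns"
}
```

Sorting by length first means `http://example.com/ns#Person` is abbreviated with `ns:` rather than `ex:`, and a local part that cannot form a valid name falls back to `nil`, so the caller writes the full IRI.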

#indent(modifier = 0) ⇒ String (protected)

Returns indent string multiplied by the depth

Parameters:

  • modifier (Integer) (defaults to: 0)

    Increase depth by specified amount

Returns:

  • (String)

    A number of spaces, depending on current depth



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 430

def indent(modifier = 0)
  " " * (@options.fetch(:log_depth, log_depth) * 2 + modifier)
end

#is_done?(subject) ⇒ Boolean (protected)

Returns:

  • (Boolean)


# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 486

def is_done?(subject)
  @serialized.include?(subject)
end

#order_subjects ⇒ Array<Resource> (protected)

Order subjects for output. Override this to output subjects in another order.

Uses #top_classes and #base_uri.

Returns:

  • (Array<Resource>)

    Ordered list of subjects



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 340

def order_subjects
  seen = {}
  subjects = []

  # Start with base_uri
  if base_uri && @subjects.keys.include?(base_uri)
    subjects << RDF::URI(base_uri)
    seen[RDF::URI(base_uri)] = true
  end

  # Add distinguished classes
  top_classes.each do |class_uri|
    graph.query(predicate:  RDF.type, object:  class_uri).
      map {|st| st.subject}.
      sort.
      uniq.
      each do |subject|
      log_debug("order_subjects") {subject.to_ntriples}
      subjects << subject
      seen[subject] = true
    end
  end

  # Mark as seen lists that are part of another list
  @lists.values.map(&:statements).
    flatten.each do |st|
      seen[st.object] = true if @lists.has_key?(st.object)
    end

  # List elements should not be targets for top-level serialization
  list_elements = @lists.values.map(&:to_a).flatten.compact

  # Sort subjects by resources over bnodes, ref_counts and the subject URI itself
  recursable = (@subjects.keys - list_elements).
    select {|s| !seen.include?(s)}.
    map {|r| [r.node? ? 1 : 0, ref_count(r), r]}.
    sort

  subjects + recursable.map{|r| r.last}
end
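
The final sort relies on Ruby's element-wise array comparison: each remaining subject is mapped to a tuple of (blank node flag, reference count, subject), so IRIs come before blank nodes and less-referenced subjects come first. The sketch below models subjects as plain strings, with blank nodes written as `_:` labels. `sketch_order_subjects` is a hypothetical helper, not the library method.

```ruby
# Hypothetical helper: order subjects the way the tuple sort would.
# `first` holds subjects already placed at the front (for example,
# the base URI and instances of the distinguished classes).
def sketch_order_subjects(subjects, ref_counts, first: [])
  rest = (subjects - first)
         .map { |s| [s.start_with?("_:") ? 1 : 0, ref_counts.fetch(s, 0), s] }
         .sort
         .map(&:last)
  first + rest
end

subjects = [
  "_:b0",
  "http://example.com/b",
  "http://example.com/a",
  "http://example.com/Class"
]
ref_counts = { "http://example.com/a" => 2 }
ordered = sketch_order_subjects(subjects, ref_counts, first: ["http://example.com/Class"])
```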

#predicate_order ⇒ Array<URI> (protected)

Defines the order of predicates to emit at the beginning of a resource description. Defaults to [rdf:type, rdfs:label, dc:title].

Returns:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 334

def predicate_order; [RDF.type, RDF::RDFS.label, RDF::URI("http://purl.org/dc/terms/title")]; end

#preprocess ⇒ Object (protected)

Perform any preprocessing of statements required



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 382

def preprocess
  # Load defined prefixes
  (@options[:prefixes] || {}).each_pair do |k, v|
    @uri_to_prefix[v.to_s] = k
  end

  prefix(nil, @options[:default_namespace]) if @options[:default_namespace]

  case
  when @options[:stream]
  else
    @options[:prefixes] = {}  # Will define actual used when matched

    @graph.each {|statement| preprocess_statement(statement)}
  end
end

#preprocess_statement(statement) ⇒ Object (protected)

Perform any statement preprocessing required. This is used to perform reference counts and determine required prefixes.

Parameters:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 402

def preprocess_statement(statement)
  #log_debug("preprocess") {statement.to_ntriples}
  bump_reference(statement.object)
  # Count properties of this subject
  (@subjects[statement.subject] ||= {})[statement.predicate] ||= 0
  @subjects[statement.subject][statement.predicate] += 1

  # Collect lists
  if statement.predicate == RDF.first
    l = RDF::List.new(subject: statement.subject, graph: graph)
    @lists[statement.subject] = l if l.valid?
  end

  if statement.object == RDF.nil || statement.subject == RDF.nil
    # Add an entry for the list tail
    @lists[RDF.nil] ||= RDF::List[]
  end

  # Pre-fetch pnames, to fill prefixes
  get_pname(statement.subject)
  get_pname(statement.predicate)
  get_pname(statement.object)
  get_pname(statement.object.datatype) if statement.object.literal? && statement.object.datatype
end
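
The two tallies kept during preprocessing (how often each term appears as an object, and how many statements each subject has per predicate) can be sketched with plain hashes. The example below models triples as three-element arrays of strings. `sketch_tally` is a hypothetical helper and leaves out list detection and prefix pre-fetching.

```ruby
# Hypothetical helper: build reference counts and per-subject
# predicate counts from an array of [subject, predicate, object].
def sketch_tally(triples)
  references = Hash.new(0)
  subjects = {}

  triples.each do |subject, predicate, object|
    # Each appearance in object position bumps the reference count
    references[object] += 1
    # Count how many statements this subject has for each predicate
    (subjects[subject] ||= Hash.new(0))[predicate] += 1
  end

  [references, subjects]
end

triples = [
  ["ex:a", "ex:knows", "_:b"],
  ["ex:c", "ex:knows", "_:b"],
  ["ex:a", "ex:name",  "\"A\""]
]
references, subjects = sketch_tally(triples)
```

A blank node referenced exactly once as an object is a candidate for inline `[ ... ]` notation, which is why the counts are gathered before any output is written.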

#prop_count(subject) ⇒ Integer (protected)

Return the number of statements having this resource as a subject other than for list properties

Returns:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 467

def prop_count(subject)
  @subjects.fetch(subject, {}).
    reject {|k, v| [RDF.type, RDF.first, RDF.rest].include?(k)}.
    values.reduce(:+) || 0
end

#quoted(string) ⇒ String (protected)

Use single- or multi-line quotes. If the literal contains \t, \n, or \r, use a multi-line quote; otherwise, use a single-line quote.

Parameters:

Returns:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 448

def quoted(string)
  if string.to_s.match(/[\t\n\r]/)
    string = string.gsub('\\', '\\\\\\\\').gsub('"""', '\\"""')
    %("""#{string}""")
  else
    "\"#{escaped(string)}\""
  end
end
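
A standalone sketch of the quoting choice is shown below. It is a simplification: the single-line branch only escapes backslashes and double quotes, whereas the writer delegates to an `escaped` helper that handles more cases. `sketch_quoted` is a hypothetical name, and the block form of `gsub` is used so that backslashes in the replacement are taken literally.

```ruby
# Hypothetical helper: pick a Turtle string form based on content.
def sketch_quoted(string)
  if string.match?(/[\t\n\r]/)
    # Long string form: keep the whitespace, guard backslashes and
    # any embedded triple quote
    body = string.gsub("\\") { "\\\\" }.gsub('"""') { '\\"""' }
    %("""#{body}""")
  else
    # Short string form: escape backslashes and double quotes
    body = string.gsub("\\") { "\\\\" }.gsub('"') { '\\"' }
    %("#{body}")
  end
end
```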

#ref_count(resource) ⇒ Integer (protected)

Return the number of times this node has been referenced in the object position

Returns:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 475

def ref_count(resource)
  @references.fetch(resource, 0)
end

#reset ⇒ Object (protected)

Reset internal helper instance variables



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 435

def reset
  @lists = {}

  @references = {}
  @serialized = {}
  @subjects = {}
end

#sort_properties(properties) ⇒ Array<String>

Take a hash from predicate uris to lists of values. Sort the lists of values. Return a sorted list of properties.

Parameters:

  • properties (Hash{String => Array<Resource>})

    A hash of Property to Resource mappings

Returns:

  • (Array<String>)

    Ordered list of properties. Uses predicate_order.



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 251

def sort_properties(properties)
  # Make sorted list of properties
  prop_list = []

  predicate_order.each do |prop|
    next unless properties[prop.to_s]
    prop_list << prop.to_s
  end

  properties.keys.sort.each do |prop|
    next if prop_list.include?(prop.to_s)
    prop_list << prop.to_s
  end

  log_debug("sort_properties") {prop_list.join(', ')}
  prop_list
end
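
The ordering rule (preferred predicates first, in their configured order, then everything else alphabetically) can be sketched in a few lines. `sketch_sort_properties` is a hypothetical helper operating on string keys, with the preferred order passed in explicitly instead of read from `predicate_order`.

```ruby
# Hypothetical helper: preferred predicates first, the rest sorted.
def sketch_sort_properties(properties, predicate_order)
  preferred = predicate_order.select { |p| properties.key?(p) }
  preferred + (properties.keys.sort - preferred)
end

properties = {
  "ex:z"       => [],
  "rdfs:label" => [],
  "ex:a"       => [],
  "rdf:type"   => []
}
sorted = sketch_sort_properties(properties, ["rdf:type", "rdfs:label", "dc:title"])
```

Preferred predicates that the subject does not have (here `dc:title`) are simply skipped.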

#start_document ⇒ Object (protected)

Output @base and @prefix definitions



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 318

def start_document
  @output.write("#{indent}@base <#{base_uri}> .\n") unless base_uri.to_s.empty?

  log_debug("start_document") {prefixes.inspect}
  prefixes.keys.sort_by(&:to_s).each do |prefix|
    @output.write("#{indent}@prefix #{prefix}: <#{prefixes[prefix]}> .\n")
  end
end
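
To make the shape of the emitted header concrete, here is a standalone sketch that writes to a `StringIO`. `sketch_start_document` is a hypothetical helper that omits indentation and logging, and the prefix table is passed in as a plain hash in which a `nil` key stands for the default namespace.

```ruby
require "stringio"

# Hypothetical helper: write @base and sorted @prefix declarations.
def sketch_start_document(output, base_uri, prefixes)
  output.write("@base <#{base_uri}> .\n") unless base_uri.to_s.empty?

  # Sorting by to_s places the nil (default) prefix first
  prefixes.keys.sort_by(&:to_s).each do |prefix|
    output.write("@prefix #{prefix}: <#{prefixes[prefix]}> .\n")
  end
end

out = StringIO.new
sketch_start_document(
  out,
  "http://example.com/",
  { foaf: "http://xmlns.com/foaf/0.1/", nil => "http://example.com/ns#" }
)
header = out.string
```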

#subject_done(subject) ⇒ Object (protected)

Mark a subject as done.



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 491

def subject_done(subject)
  @serialized[subject] = true
end

#top_classes ⇒ Array<URI> (protected)

Defines rdf:type of subjects to be emitted at the beginning of the graph. Defaults to rdfs:Class

Returns:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 329

def top_classes; [RDF::RDFS.Class]; end

#write_epilogue

This method returns an undefined value.

Outputs the Turtle representation of all stored triples.

See Also:



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 174

def write_epilogue
  case
  when @options[:stream]
    stream_epilogue
  else
    @max_depth = @options[:max_depth] || 3

    self.reset

    log_debug("\nserialize") {"graph: #{@graph.size}"}

    preprocess

    start_document

    # Remove lists that are referenced and have non-list properties;
    # these are legal, but can't be serialized as lists
    @lists.reject! do |node, list|
      ref_count(node) > 0 && prop_count(node) > 0
    end

    order_subjects.each do |subject|
      unless is_done?(subject)
        statement(subject)
      end
    end
  end
  super
end

#write_prologue

This method returns an undefined value.

Write out declarations



# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 160

def write_prologue
  case
  when @options[:stream]
    stream_prologue
  else
  end
  super
end

#write_triple(subject, predicate, object)

This method returns an undefined value.

Adds a triple to be serialized

Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Value)


# File 'vendor/bundler/ruby/2.5.0/bundler/gems/rdf-turtle-fb10116ce928/lib/rdf/turtle/writer.rb', line 148

def write_triple(subject, predicate, object)
  statement = RDF::Statement.new(subject, predicate, object)
  if @options[:stream]
    stream_statement(statement)
  else
    @graph.insert(statement)
  end
end
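
The difference between the two modes is when output happens: in streaming mode each triple is written as it arrives, while in the default mode triples are held back so that the epilogue can group and order them. The following standalone sketch illustrates that trade-off. `SketchTripleSink` is a hypothetical class whose grouping is far simpler than the writer's actual serialization.

```ruby
require "stringio"

# Hypothetical class contrasting immediate and deferred output.
class SketchTripleSink
  def initialize(output, stream: false)
    @output = output
    @stream = stream
    @buffered = []
  end

  def write_triple(subject, predicate, object)
    if @stream
      # Streaming: emit right away, no grouping possible
      @output.write("#{subject} #{predicate} #{object} .\n")
    else
      # Default: hold the triple until the epilogue
      @buffered << [subject, predicate, object]
    end
  end

  def write_epilogue
    return if @stream

    # With everything in hand, statements can be grouped by subject
    @buffered.group_by(&:first).each do |subject, triples|
      body = triples.map { |_, p, o| "#{p} #{o}" }.join(" ;\n  ")
      @output.write("#{subject} #{body} .\n")
    end
  end
end

streamed = StringIO.new
sink = SketchTripleSink.new(streamed, stream: true)
sink.write_triple("ex:a", "ex:p", "1")
sink.write_triple("ex:a", "ex:q", "2")
sink.write_epilogue

buffered = StringIO.new
sink = SketchTripleSink.new(buffered)
sink.write_triple("ex:a", "ex:p", "1")
sink.write_triple("ex:a", "ex:q", "2")
before_epilogue = buffered.string.dup
sink.write_epilogue
```

Deferred output costs memory proportional to the graph, which is why the :stream option is described as suitable for large graphs.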