Managing {} indexing in SUBSREF/SUBSASGN overloads.

Question

Cedric on 22 May 2015


Direct link to this question

https://es.mathworks.com/matlabcentral/answers/218006-managing-indexing-in-subsref-subsasgn-overloads

Edited: Cedric on 29 May 2015

Dear all,

The main question that I would like to discuss (I don't have the answer) is: is it possible to build a class which supports {} indexing like a cell array (by overloading SUBSREF, SUBSASGN, etc.), and which can be nested (in particular, mixed with structs)? I am building a series of examples to illustrate the issues that arise when we try to build such a class.

Assume that we need to develop a class that wraps a cell array, and adds extra features (among other things, subclass handle). As mentioned in [ Subclassing MATLAB Built-In Types ], we cannot subclass the cell built-in class. One way to overcome this is to implement the following:

 classdef MyCell < handle
    properties
        array
    end
    methods
        function obj = MyCell( varargin )
            if isempty( varargin )
                obj.array = {} ;
            else
                obj.array = cell( varargin{:} ) ;
            end
        end
    end
 end

This works great as long as we are willing to deal with the array property:

 >> mc = MyCell( 1, 3 ) ;
 >> mc.array(:) = {30, 'Temperature', {50,60}} ;
 >> mc.array{3}{2}
 ans =
    60

Next we want to manage indexing, and to "transfer" the {} indexing on a MyCell object to the internal array property. We first try the following, overloading SUBSREF and SUBSASGN (I leave out on purpose the handling of cases that are not relevant to my point, e.g. dot and () indexing, overloads of NUMEL, END, etc.):

 classdef MyCell < handle
    properties
        array
    end
    methods
        function obj = MyCell( varargin )
            if isempty( varargin )
                obj.array = {} ;
            else
                obj.array = cell( varargin{:} ) ;
            end
        end
        function [varargout] = subsref( obj, S )
            if S(1).type(1) == '{'            
                [varargout{1:nargout}] = builtin( 'subsref', obj.array, S ) ;
            else
                [varargout{1:nargout}] = builtin( 'subsref', obj, S ) ;
            end
        end
        function obj = subsasgn( obj, S, varargin )
            if S(1).type(1) == '{'
                obj.array = builtin( 'subsasgn', obj.array, S, varargin{:} ) ;
            else
                obj = builtin( 'subsasgn', obj, S, varargin{:} ) ;
            end
        end
    end
 end

This seems to be working well:

 >> mc = MyCell() ;
 >> mc{1} = 8 ;
 >> mc{2}.a = 5 ;
 >> mc{2}.b = {7,8,9} ;
 >> mc
 mc = 
    MyCell with properties:
       array: {[8]  [1x1 struct]}
 >> mc{2}.b{3}
 ans =
     9

Now what happens if we try to nest MyCell objects?

 >> mc2 = MyCell() ;  mc2{1} = 8 ;
 >> mc1 = MyCell() ;  mc1{1} = 9 ;  mc1{2} = mc2 ;
 >> mc1
 mc1 = 
    MyCell with properties:
       array: {[9]  [1x1 MyCell]}
 >> mc1{2}        % Access content of cell 2 of mc1.array implicitly, which is mc2.
 ans = 
    MyCell with properties:
       array: {[8]}
 >> mc1{2}{1}     % Access content of cell 1 of mc2.array implicitly, through mc1{2}.
 Error using subsref
 Cell contents reference from a non-cell array object.
 Error in MyCell/subsref (line 17)
                [varargout{1:nargout}] = builtin( 'subsref', obj.array, S ) ;

Why this error? If we run, instead:

 >> mc1{2}.array{1}
 ans =
      8

it works. So it seems that the second index (the one meant to index mc2 stored in mc1{2}) is not handled by the MyCell overload of SUBSREF, but by a built-in. This is documented in [ subsref ], in the "Tips" section: Within a class's own methods, MATLAB calls the built-in subsref, not the class defined subsref. This behavior enables you to use the default subsref behavior when defining specialized indexing for your class. [ Object Array Indexing ], in the "Built-in subsref and .." section, also states that we can explicitly call the overloaded/class-defined SUBSREF: .. = subsref( obj, S ) .
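As a minimal sketch of what the S argument contains, the snippet below builds such an indexing structure by hand with substruct and passes it to subsref explicitly, using built-in types only. The data and the variable names c, S, and v are illustrative and not part of the original post.

```matlab
% A cell holding a number and a struct (illustrative data only).
c = {30, struct('a', 42)};

% c{2}.a expressed as an indexing structure: one element per level.
S = substruct('{}', {2}, '.', 'a');

% Explicit function-call form of the indexing expression c{2}.a.
v = subsref(c, S);

disp(numel(S))      % number of indexing levels
disp(S(1).type)     % type of the first level
disp(v)             % value reached by the full chain
```

Each element of S describes one level of the indexing expression, which is why the overloads in this thread branch on S(1).type and on numel( S ).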

Understanding this, we can try to differentiate cases and call explicitly either the built-in, or the overload:

        function [varargout] = subsref( obj, S )
            if S(1).type(1) == '{'            
                if numel( S ) == 1
                    % One level indexing -> call built-in on the array property.
                    [varargout{1:nargout}] = builtin( 'subsref', obj.array, S ) ;
                else
                    % Multiple levels indexing -> extract content of array first using
                    % first level, and then delegate the rest of the indexing to the 
                    % content.
                    tmp = builtin( 'subsref', obj.array, S(1) ) ;
                    [varargout{1:nargout}] = subsref( tmp, S(2:end) ) ;
                end
            else
                [varargout{1:nargout}] = builtin( 'subsref', obj, S ) ;
            end
        end

With this update of SUBSREF, we finally get:

 >> mc1{2}{1}
 ans =
     8

which works. There are two issues that I would like to discuss though, which indicate that I still don't know what is going on: one about SUBSASGN (which is more complicated), and another still with SUBSREF. I will discuss the latter first, so that the thread can focus on SUBSREF to begin with.

What happens now if we nest MyCell objects and structs ?

 >> mc2 = MyCell() ;  mc2{1} = 8 ;
 >> mc1 = MyCell() ;  mc1{1} = 9 ;  mc1{2}.a = mc2 ;
 >> mc1
 mc1 = 
    MyCell with properties:
       array: {[9]  [1x1 struct]}
 >> mc1{2}
 ans = 
    a: [1x1 MyCell]
 >> mc1{2}.a
 ans = 
    MyCell with properties:
       array: {[8]}
 >> mc1{2}.a{1}
 Error using subsref
 Cell contents reference from a non-cell array object.
 Error in MyCell/subsref (line 22)
                    [varargout{1:nargout}] = subsref( tmp, S(2:end) ) ;

QUESTION #1 : What happens in the example above? I thought that when we are evaluating mc1{2}.a{1}, the first level of indexing would extract the content of mc1.array{2} through the call to BUILTIN, which is a struct:

 K>> class( tmp )
 ans =
     struct

and then apply SUBSREF with the rest of the indexing to this struct. This should mean that the SUBSREF of the built-in struct class is called, and then the overloaded SUBSREF of mc2. This cascading doesn't happen though: the debugger never re-enters the class-defined SUBSREF. Hence my question: is there some kind of parsing that happens in the first overloaded SUBSREF and that fails to determine that the last {} indexing doesn't apply to a cell, but to a MyCell object with an overloaded SUBSREF? This seems to be what is suggested by Andrew Janke in [ here ] (look up the keyword nonintuitive).

EDIT 05/26/2015 @ 8:22pm - Titus proposes a way to overcome this in his answer below, and I wrote a comparable solution over the weekend:

        function [varargout] = subsref( obj, S )            
            if numel( S ) == 1
                if S(1).type(1) == '.'
                    [varargout{1:nargout}] = builtin( 'subsref', obj, S ) ;
                else
                    [varargout{1:nargout}] = builtin( 'subsref', obj.array, S ) ;
                end
            else
                tmp = obj ;
                for k = 1 : numel( S ) - 1
                    if isa( tmp, 'MyCell' )
                        tmp = subsref( tmp, S(k) ) ;
                    else
                        tmp = builtin( 'subsref', tmp, S(k) ) ;
                    end
                end
                [varargout{1:nargout}] = subsref( tmp, S(end) ) ;
            end
        end

Yet, this is highly unsatisfying and heavy, and it indicates that I don't fully understand what happens internally.
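The level-by-level idea can be tried in isolation on built-in types. The sketch below is not the class method itself: it walks a hand-built S one element at a time from a script, where each explicit subsref call dispatches on the class of the current intermediate value. The data and the variable names are assumptions made for illustration.

```matlab
% Nested built-in containers standing in for mc1{2}.a{1} (illustrative).
c = {9, struct('a', {{8}})};
S = substruct('{}', {2}, '.', 'a', '{}', {1});

% Apply one indexing level per iteration instead of the whole chain.
tmp = c;
for k = 1 : numel(S)
    tmp = subsref(tmp, S(k));
end
disp(tmp)
```

This sketch only covers single-output indexing. An intermediate level that yields a comma-separated list (e.g. a range inside {}) would need the varargout handling used in the methods above.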

QUESTION #2 : is MATLAB analyzing the whole substruct and performing a kind of "iscell" test (at a more fundamental level, because I tried to overload ISCELL and CLASS, and it didn't work) each time there is a {} indexing type, which would be wrong in a system that allows overloading SUBSREF/ASGN, or is it more subtle than that?

Following the same logic and counter-examples, we start by implementing SUBSASGN as follows:

        function obj = subsasgn( obj, S, varargin )
            if S(1).type(1) == '{'
                obj.array = builtin( 'subsasgn', obj.array, S, varargin{:} ) ;
            else
                obj = builtin( 'subsasgn', obj, S, varargin{:} ) ;
            end
        end

but it fails for the same reason. Assuming that MyCell objects won't be nested, we can build a two-step approach, where we first SUBSASGN the indexing chain from element 2 to the end, store the result in a temporary variable, and then SUBSASGN the first level of indexing (the one applied to the MyCell object). But again, this fails if MyCell objects can be nested. Following the same track as for SUBSREF, we finally build two solutions. The first performs temporary assignments (similar to the temporary refs above):

        function obj = subsasgn( obj, S, varargin )
            if numel( S ) == 1
                if S(1).type(1) == '.' || ~isa( obj, 'MyCell' )
                    obj = builtin( 'subsasgn', obj, S, varargin{:} ) ;
                else
                    obj.array = builtin( 'subsasgn', obj.array, S, varargin{:} ) ;
                end
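            else
                % Assign the last level into the value addressed by the preceding levels.
                tmp1 = subsref( obj, S(1:end-1) ) ;
                tmp1 = subsasgn( tmp1, S(end), varargin{:} ) ;
                % Propagate the updated value back up the chain, one level at a time.
                for k = numel( S ) - 2 : -1 : 1
                    tmp2 = subsref( obj, S(1:k) ) ;
                    tmp2 = subsasgn( tmp2, S(k+1), tmp1 ) ;
                    tmp1 = tmp2 ;
                end
                % Write the result back through the first level; without this
                % step, value-type content (e.g. a struct) is never updated in obj.
                obj = subsasgn( obj, S(1), tmp1 ) ;
            end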
        end

and the second where we work on the S substruct before performing a simple call to BUILTIN:

        function obj = subsasgn( obj, S, varargin )
            if S(1).type(1) == '.' && strcmp( S(1).subs, 'array' )
                Sext(1:2) = S(1:2) ;
                k_start = 3 ;
            else
                Sext(2) = S(1) ;
                Sext(1) = substruct( '.', 'array' ) ;
                k_start = 2 ;
            end
            for k = k_start : length( S )
                if isa( subsref( obj, S(1:k-1) ), 'MyCell' )
                    if ~( S(k).type(1) == '.' && strcmp( S(k).subs, 'array' ) )
                        Sext(end+1) = substruct( '.', 'array' ) ;
                    end
                end
                Sext(end+1) = S(k) ;
            end
            obj = builtin( 'subsasgn', obj, Sext, varargin{:} ) ;
        end

The rationale for the latter is that indexing nested MyCell objects through their array property is perfectly well supported by built-in mechanisms, so we can extend the S substruct by adding an explicit substruct( '.', 'array' ) entry before each direct {} indexing of a MyCell object.
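The effect of extending S can be sketched without the class. In the snippet below, plain structs with an array field stand in for MyCell objects (an assumption made so that the example runs on its own; real MyCell objects are handles), and the names inner, outer, dotArray, before, and after are illustrative.

```matlab
% Plain structs with an 'array' field stand in for MyCell objects.
inner.array = {8};
outer.array = {9, inner};

% Indexing chain as it would arrive for outer{2}{1}.
S = substruct('{}', {2}, '{}', {1});

% Insert an explicit .array level before each {} applied to a wrapper.
dotArray = substruct('.', 'array');
Sext = [dotArray, S(1), dotArray, S(2)];

% The built-ins can now process the whole extended chain in one call.
before = subsref(outer, Sext);          % outer.array{2}.array{1}
outer  = subsasgn(outer, Sext, 10);     % outer.array{2}.array{1} = 10
after  = outer.array{2}.array{1};
```

Here the positions of the extra entries are known in advance. The method above has to discover them by testing the class of each intermediate value, which is where the repeated SUBSREF calls come from.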

Yet, both methods are pretty slow because they involve calling SUBSREF in a loop for testing a class at each level of the indexing (and SUBSREF itself involves a loop). This could be slightly improved by looping over occurrences of the {} type of indexing only, but it would not bring much (at least no extra understanding of the internals).

In any case, this is heavy, unsatisfying, and it shows that I still do not understand what happens internally. I would therefore highly appreciate any clarification/insight about my remark/question #2 above!

Best regards,

Cedric

Additional remarks

For a long time I believed that this was an issue with NUMEL (related to the issue mentioned here by Matt). But lately I could solve most of my issues with overloading NUMEL thanks to better documentation and a lot of testing. The issue mentioned in this thread is, as far as I am concerned, not related to NUMEL (but to some parsing, as mentioned in Q#2).
The final versions of SUBSREF and SUBSASGN successfully pass my (small) "test suite", so it is "functional" if speed is not critical.
Speed-wise.. it is slow: when involved at various levels in a rather deep structure, the profiler shows that indexing is ~100-200 times slower than indexing a similar structure made of fundamental classes only.
While I would still like to understand why we have to implement such a mess when there is a priori no reason for it (if it's parsing, why is it implemented this way?), the approach is suitable for my needs: implementing a user-oriented modeling structure which does not need to be fast, because all objects stored in the final structure are instances of subclasses of the handle class (so the structure stores references, and intensive iterative computations are never performed by indexing through the whole structure).


Answer 1

Titus Edelhofer on 26 May 2015


Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/218006-managing-indexing-in-subsref-subsasgn-overloads#answer_180490


Hi Cedric,

very tricky question, indeed. I gave it a try and came up with this subsref:

function [varargout] = subsref( obj, S )
  if S(1).type(1) == '{'
    if numel( S ) == 1
      % One level indexing -> call built-in on the array property.
      [varargout{1:nargout}] = builtin( 'subsref', obj.array, S ) ;
    else
      % Multiple levels indexing -> do one by one. Probably you won't
      % even need to distinguish between numel(S) being 1 or gt 1 but
      % I did not try ...
      tmp = obj;
      for i=1:length(S)-1
        if isa(tmp, 'MyCell')
          tmp = builtin('subsref', tmp.array, S(i));
        else
          tmp = subsref(tmp, S(i));
        end
      end
      [varargout{1:nargout}] = subsref( tmp, S(end) ) ;
    end
  else
    [varargout{1:nargout}] = builtin( 'subsref', obj, S ) ;
  end
end

For your case of MyCell and struct it works, but I'm 99% sure that I missed a lot of other allowed/possible indexing calls ...

Titus

2 comments

Titus Edelhofer on 26 May 2015

One thing I forgot: I guess my answer clearly shows why I usually try not to overload subsref and subsasgn ;-)

Cedric on 26 May 2015

Edited: Cedric on 26 May 2015

Hello Titus,

First, thank you for your answer! It is a tricky topic, and I was almost certain that no one would even read the question ;-)

I implemented both the FOR loop approach over the weekend and another, block-based approach which avoids the full loop (it defines block boundaries based on a test on S.type).

Yet, this is highly unsatisfying and heavy, and it indicates that I don't fully understand what happens internally. Is MATLAB analyzing the whole substruct and performing a kind of "iscell" test (at a more fundamental level, because I tried to overload ISCELL and it didn't work) each time there is a {} indexing type, which would be wrong in a system that allows overloading SUBSREF/ASGN, or is it more subtle than that?

I up-voted your answer, but I will try to leave the thread open to see if anyone else (Sean, Matt, or any other happy oop-er ;-)) can provide more insights about the internals and the reasons why it was implemented this way.

Thanks!

Cedric
